

The British Journal for the 
Philosophy of Science 










Published for the 

British Society for the Philosophy of Science 
by Oxford University Press

1989 


CONTENTS 




ARTICLES 


THE HOLE TRUTH

Jeremy Butterfield

PRIMARY QUALITIES ARE SECONDARY QUALITIES TOO

Graham Priest 

DISTANT ACTION IN CLASSICAL ELECTROMAGNETIC THEORY 

Brent Mundy 

SIMULTANEITY, CONVENTIONALITY AND EXISTENCE 

Vesselin Petkov 

TAKING MATHEMATICS SERIOUSLY? 

Joseph Zycinski 

PERCEPTION AND NEUROSCIENCE 

Grant Gillett 

IS CONFIRMATION DIFFERENTIAL? 

Edward Erwin and Harvey Siegel 

TWO PROBLEMS OF INDUCTION? 

John O'Neill 

TWO BRAINS, TWO MINDS? WIGAN'S THEORY OF MENTAL DUALITY 
Roland Puccetti 

STRUCTURAL ANALOGIES BETWEEN PHYSICAL SYSTEMS 

Peter Kroes 

FREGE, INFORMATIVE IDENTITIES, AND LOGICISM 

Peter Milne 

THE AUTONOMY OF PROBABILITY THEORY (NOTES ON KOLMOGOROV, RÉNYI, AND POPPER) 
Hugues Leblanc

A REFUTATION OF POPPERIAN INDUCTIVE SCEPTICISM 

Ken Gemes 

VECTORS AND CHANGE 

John Bigelow and Robert Pargetter 

THE METAMATHEMATICS—POPPERIAN EPISTEMOLOGY CONNECTION AND ITS RELATION 
TO THE LOGIC OF TURING'S PROGRAMME
Jean-Roch Beausoleil 

NOTE ON ENTROPY, DISORDER AND DISORGANIZATION 

K. G. Denbigh 

PRAGMATIC TRUTH AND THE LOGIC OF INDUCTION 

Newton C. A. da Costa and Steven French 

AESTHETIC CONSTRAINTS ON THEORY SELECTION: A CRITIQUE OF LAUDAN 
James E. Martin 

AN ANOMALY IN THE D-N MODEL OF EXPLANATION 

Alex Blum 

THE NATURE OF REALITY 

Michael Redhead 

THEORY STRUCTURE AND THEORY CHANGE IN CONTEMPORARY MOLECULAR BIOLOGY 
Philip Kitcher and Sylvia Culp 

THE MANY FACES OF IRREVERSIBILITY

K. G. Denbigh




ON THE NECESSITY OF RANDOM SAMPLING 
D. J. Johnstone 

BIOLOGICAL FOUNDATIONS OF PREDICTION IN AN UNPREDICTABLE ENVIRONMENT 
M. Marusic 

ERNST MACH LEAVES 'THE CHURCH OF PHYSICS'

John Blackmore 

CONNECTIONISM, MODULARITY AND TACIT KNOWLEDGE 

Martin Davies 


DISCUSSIONS 


GLYMOUR ON DEOCCAMIZATION AND THE EPISTEMOLOGY OF GEOMETRY 
Jane Duran 

CONVENTIONALISM IN PHYSICS 

W. T. Morris 

NOT VERY LIKELY: A REPLY TO RAMSEY 

D. E. Watt 

CAN A THEORY-LADEN OBSERVATION TEST THE THEORY? 

A. Franklin et al. 

QUINE ON THEORY AND LANGUAGE 

Nobuharu Tanji 

HOW TO DEFEND SCIENCE AGAINST SCEPTICISM: A REPLY TO BARRY GOWER 
Alan Chalmers 

TESTING FOR CONVERGENT REALISM 

Jerrold L. Aronson 

RUBEN AND THE METAPHYSICS OF THE SOCIAL WORLD 

Raimo Tuomela 

ZANDE LOGIC AND WESTERN LOGIC 

Richard C. Jennings 

SOME COMMENTS CONCERNING SPIN AND RELATIVITY 

Robert Weingard

IF IT AIN'T BROKE, DON'T FIX IT 

Larry Laudan 

FIX IT AND BE DAMNED: A REPLY TO LAUDAN 

John Worrall 

VAN ROOIJEN AND MAYR VERSUS POPPER: IS THE UNIVERSE CAUSALLY CLOSED? 
Tom Settle

EPIPHENOMENALISM AND MACHINES: A DISCUSSION OF VAN ROOIJEN'S CRITIQUE OF 
POPPER 

Davor Pećnjak

ON THE ORIGIN OF SPIN IN RELATIVITY 

Mendel Sachs 

A COMMENT ON MAXWELL'S RESOLUTION OF THE WAVE/PARTICLE DILEMMA 
Euan J. Squires

REVIEW ARTICLES 


185 THE NEW EXPERIMENTALISM
Robert Ackermann
191 THE PHILOSOPHY OF QUANTUM MECHANICS
Jeffrey Bub
213 ANTI-REALISM AND LOGIC
A. J. Dale
219 MR KEYNES ON PROBABILITY
F. P. Ramsey


REVIEWS 


423 SIEGEL, H.: Relativism Refuted: A Critique of Contemporary Epistemological Relativism.
Review by Robert Nola

557 HOOKWAY, CHRISTOPHER: Quine: Language, Experience and Reality. Review by Roger F.
Gibson


MISCELLANEOUS 


March ANNOUNCEMENT: BRITISH SOCIETY FOR THE PHILOSOPHY OF SCIENCE ANNUAL 
CONFERENCE 1989 

Dec. ANNOUNCEMENT: LAKATOS AWARD IN PHILOSOPHY OF SCIENCE 

Dec. ANNOUNCEMENT: BRITISH SOCIETY FOR THE PHILOSOPHY OF SCIENCE ANNUAL 
CONFERENCE SEPTEMBER 1990 


INDEX OF AUTHORS 


ARTICLES 


BEAUSOLEIL, JEAN-ROCH. The Metamathematics-Popperian Epistemology Connection and its
relation to the Logic of Turing's Programme, 307

BIGELOW, JOHN and PARGETTER, ROBERT. Vectors and Change, 289

BLACKMORE, JOHN. Ernst Mach Leaves 'The Church of Physics', 519

BLUM, ALEX. An Anomaly in the D-N Model of Explanation, 365

BUTTERFIELD, JEREMY. The Hole Truth, 1

DA COSTA, NEWTON C. A. and FRENCH, STEVEN. Pragmatic Truth and the Logic of Induction,
333

DAVIES, MARTIN. Connectionism, Modularity and Tacit Knowledge, 541

DENBIGH, K. G. The Many Faces of Irreversibility, 501

DENBIGH, K. G. Note on Entropy, Disorder and Disorganization, 323

ERWIN, EDWARD and SIEGEL, HARVEY. Is Confirmation Differential?, 105

GILLETT, GRANT. Perception and Neuroscience, 83

HUGUES LEBLANC. The Autonomy of Probability Theory (Notes on Kolmogorov, Rényi, and
Popper), 167

JOHNSTONE, D. J. On the Necessity of Random Sampling, 443

KEN GEMES. A Refutation of Popperian Inductive Scepticism, 183

KITCHER, PHILIP and CULP, SYLVIA. Theory Structure and Theory Change in Contemporary
Molecular Biology, 459

KROES, PETER. Structural Analogies Between Physical Systems, 145

MARTIN, JAMES E. Aesthetic Constraints on Theory Selection: A Critique of Laudan, 357

MARUSIC, M. Biological Foundations of Prediction in an Unpredictable Environment, 485

MUNDY, BRENT. Distant Action in Classical Electromagnetic Theory, 39

O'NEILL, JOHN. Two Problems of Induction?, 121

PETER MILNE. Frege, Informative Identities, and Logicism, 155

PETKOV, VESSELIN. Simultaneity, Conventionality and Existence, 69

PRIEST, GRAHAM. Primary Qualities are Secondary Qualities too, 29

PUCCETTI, ROLAND. Two Brains, Two Minds? Wigan's Theory of Mental Duality, 137

REDHEAD, MICHAEL. The Nature of Reality, 429 

ZYCINSKI, JOSEPH. Taking Mathematics Seriously?, 77 


REVIEW ARTICLES 


ACKERMANN, ROBERT. The New Experimentalism, 185 
BUB, JEFFREY. The Philosophy of Quantum Mechanics, 191
DALE, A. J. Anti-Realism and Logic, 213

RAMSEY, F. P. Mr Keynes on Probability, 219 


REVIEWS 
GIBSON, ROGER F. Hookway, Christopher: Quine: Language, Experience and Reality, 557
NOLA, ROBERT. Siegel, H.: Relativism Refuted: A Critique of Contemporary Epistemological
Relativism, 423


DISCUSSIONS 


ARONSON, JERROLD L. Testing for Convergent Realism, 255 

CHALMERS, ALAN. How To Defend Science Against Scepticism: A Reply to Barry Gower, 249

DURAN, JANE. Glymour on Deoccamization and the Epistemology of Geometry, 127

FRANKLIN, A. et al. Can A Theory-Laden Observation Test The Theory?, 229

JENNINGS, RICHARD C. Zande Logic and Western Logic, 275

LAUDAN, LARRY. If It Ain't Broke, Don't Fix It, 369

MORRIS, W. T. Conventionalism in Physics, 135

PEĆNJAK, DAVOR. Epiphenomenalism and Machines: A Discussion of Van Rooijen's Critique of
Popper, 407

SACHS, MENDEL. On the Origin of Spin in Relativity, 413

SETTLE, TOM. Van Rooijen and Mayr versus Popper: Is The Universe Causally Closed?, 391

SQUIRES, EUAN. A Comment on Maxwell's Resolution of the Wave/Particle Dilemma, 417

TANJI, NOBUHARU. Quine on Theory and Language, 233

TUOMELA, RAIMO. Ruben and the Metaphysics of the Social World, 261

WATT, D. E. Not Very Likely: A Reply to Ramsey, 223

WEINGARD, ROBERT. Some Comments Concerning Spin and Relativity, 287

WORRALL, JOHN. Fix It and Be Damned: A Reply to Laudan, 377





Brit. J. Phil. Sci. 40 (1989), 1-28 Printed in Great Britain 


The Hole Truth


JEREMY BUTTERFIELD


1 Introduction 

2 The Threat 

3 Determinism in terms of Models 
4 Models and Worlds 

5 Essentialism 

6 Denying Transworld Identity 


1 INTRODUCTION


Earman and Norton have recently argued that substantivalism poses a threat, 
not just to the truth of determinism, but also to its possibility (Earman and 
Norton [1987]). The aim of this paper is to overcome this threat: I shall argue 
that an attractive version of substantivalism can admit an attractive version of 
determinism. To make my discussion self-contained, this Section describes the 
background to the threat; and Section 2 describes it. Then I survey ways to 
escape it, and advocate one of them. It turns out that the threat raises general
issues about determinism and substantivalism. It causes trouble for a common 
definition of determinism; and it forces substantivalism to decide on a doctrine 
about the 'transworld identity' of spacetime points—a matter on which 
substantivalists are often silent (e.g., Butterfield [1984]). Section 3 considers 
two definitions of determinism for spacetime theories: a common one, which is 
violated just as Earman and Norton claim; and a less common one, extracted 
from general relativity texts, which is not violated. Section 4 assesses these 
definitions as explications of the basic idea of determinism. Urging the merits of 
the second definition leads into the topic of transworld identity. I argue that in 
order to escape fully from Earman and Norton’s threat, we need either to be 
essentialists (Section 5) or to deny that one and the same spacetime point can 
occur in two worlds (Section 6). Maudlin [1988] has urged the first option; but 
I prefer the second option. It fits the second definition of determinism better
than essentialism does; it also has other advantages unrelated to determinism. 

Substantivalism is the claim that our physical theory commits us to the 
existence of spacetime points, and perhaps to spacetime as the set or 
mereological fusion of all the points. The popularity of this claim reflects the
rise of scientific realism from the mid-1960's onwards. For scientific realism
holds that one is committed to believing in the existence of those entities that 
are ineliminably referred to or quantified over by one’s best scientific theories. 
And our best spacetime theories are almost always presented as quantifying 
over spacetime points—with never a hint of how to eliminate such quantifica- 
tion. As an aspiring scientific realist, I find this version of substantivalism 
attractive. 

Determinism is a feature of a theory. We shall consider only theories of 
spacetime structure, and of matter fields in spacetime. Such a theory specifies a 
set of models; each makes the theory true, but they differ among themselves in 
other ways. Thus each model comprises a manifold and various geometric 
objects on it: metric and matter fields, and connections; the objects obeying
the laws of the theory (partial differential equations). But the models differ 
among themselves in other ways, in particular on initial and boundary 
conditions. Thus these models represent physically possible worlds, according 
to the theory. (I will discuss later whether we can take the models to be such
worlds.) The basic idea of determinism is this: a spacetime theory is
deterministic if any two of its models that agree on the physical state at one 
time agree on the physical state at any other time. This idea needs to be made 
precise: 'agree on the physical state' needs to be spelt out in terms of 
diffeomorphisms and geometric objects, and 'at a time' needs to be spelt out in 
terms of time-slices defined by the spacetime's temporal structure; see Section 
3 for details. For the moment, note only that the idea can be generalized: the 
spacetime regions on which the physical states are determined and on which 
they do the determining, need not be, respectively, the entire spacetime and a
time-slice. So one gets various precise formulations of determinism: the notion
of determinism will be weaker the larger the region on which one assumes
agreement, and the smaller the region on which agreement is implied.

Earman and Norton's threat is essentially that a substantivalist must make a 
distinction, which physics does not see and which rules out determinism. The 
threat arises from the fact that spacetime theories have pairs of models, the two 
models within a pair agreeing on a region of spacetime, but differing elsewhere. 
In the practice of physics, such pairs can often be regarded as representing the 
same physically possible world; and when they cannot be so regarded, one of 
course concludes that determinism, in the sense of determination by data on 
that region, fails for the theory concerned. But, say Earman and Norton, a 
substantivalist must regard any two such models as representing distinct 
physically possible worlds; and must therefore deny determinism at a stroke. 
Furthermore, the threat is a strong one in the sense that even very weak 
formulations of determinism are supposed to be ruled out: the region of 
agreement can be very extensive—all of spacetime less a small ‘hole’. 

One response is of course to let determinism fail. And Earman and Norton
have no special brief to defend determinism in spacetime theories. Indeed
Earman's masterly recent book details the ways in which determinism fails in
such theories ([1986]; Chapters 3, 4, 10). But Earman and Norton hold that if
determinism fails 'it should fail for a reason of physics' ([1987]; p. 524), of the
kind detailed in Earman's book. So what is wrong with substantivalism is not
simply that it rules out determinism, but that it does so at a stroke for a wide
class of theories, and furthermore without affecting the predictions of the
theories.

I agree with this stance: and so I must face the threat. 


2 THE THREAT 


To spell out the threat, I shall first state a piece of mathematics and then 
Earman and Norton’s interpretative claims. 

Assume we have a spacetime theory whose models comprise a manifold and 
various geometric objects, « M,O,» . Recall that a diffeomorphism d on M is a 
smooth 1-to-1 map from M onto M; and that d induces a ‘drag along’ map d* on 
the geometric objects. Thus the map d on points induces a map on curves, and 
so on directional derivatives i.e. vectors, and so on other tensors; this induced 
map is written d*. For example, suppose one drags a metric h by d, and 
according to h the distance between points p and qis A (for there to be a unique 
distance, one takes p and q close enough to be connected by a unique geodesic). 
Then, according to the dragged metric d*(h), the distance between points d(p) 
and d(q) is 4. Similarly for dragging geometric objects other than a metric. In 
short, the image-points have the properties and relations to one another, 
according to the dragged geometric objects, that the argument-points have, 
according to the original geometric objects. 
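
The distance example can be put in symbols. What follows is only a sketch: u and v are illustrative names for tangent vectors at p (they do not appear in the text), and $d^*$ is used for the drag-along of every kind of object, as in the paragraph above. Many differential geometry texts would write the same operation on a metric as a pull-back by the inverse of d.

```latex
% Drag-along of a metric h by a diffeomorphism d.
% Sketch only: u, v are illustrative tangent vectors at p.
\[
  \bigl(d^*h\bigr)_{d(p)}\bigl(d^*u,\; d^*v\bigr) \;=\; h_p(u,\, v)
  \qquad \text{for all } u, v \in T_pM .
\]
% Lengths of curves are therefore preserved, and so are geodesic distances:
\[
  \operatorname{dist}_{d^*h}\bigl(d(p),\, d(q)\bigr) \;=\; \operatorname{dist}_{h}(p,\, q) .
\]
```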

Assume also that our theory satisfies an 'active' version of general 
covariance as follows: 


(GC) If $\langle M, O_i \rangle$ is a model, and d is a diffeomorphism of M onto M, then the
dragged along tuple $\langle M, d^*(O_i) \rangle$ is also a model.


Now on any manifold M, and any neighbourhood H (for 'hole') of M, there are
arbitrarily many diffeomorphisms of M onto M that are identity outside H and
differ from identity within H. Thus given a spacetime theory satisfying (GC),
and a model of it, we can obtain arbitrarily many other models. We do so by
smoothly sliding the geometric objects around within a hole, and having this
sliding join smoothly at the edge of the hole.

We can picture the way a diffeomorphism d, that is identity outside a hole H,
acts on a model to produce another. Cf. Figure 1. p is outside H and d sends p to
itself; within H, d maps q to r, r to s etc. Thus if q is distance 2 from p in the given
model, r is distance 2 from p in the new model. Similarly, s is distance 1 in the
new model, inheriting this property from its pre-image.
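
A toy computation may help fix ideas. The sketch below is an illustration only, under assumptions that are not in the text: the manifold is the real line, the hole is the interval (0, 1), and the single geometric object is a scalar field rather than a metric or tensor. The names `bump`, `hole_diffeo`, `inverse_hole_diffeo` and `drag` are hypothetical. The map built here is the identity outside the hole, differs from it inside, and joins smoothly at the edge; dragging a field along it leaves the field unchanged outside the hole and changes it within.

```python
import math


def bump(x: float, a: float, b: float) -> float:
    """Smooth bump: positive on (a, b), zero elsewhere, flat to all orders at a and b."""
    if x <= a or x >= b:
        return 0.0
    t = (x - a) / (b - a)  # rescale the hole to (0, 1)
    return math.exp(-1.0 / (t * (1.0 - t)))


def hole_diffeo(x: float, a: float = 0.0, b: float = 1.0, eps: float = 0.5) -> float:
    """Toy hole diffeomorphism d: identity outside (a, b), a smooth shift inside.

    Assumption: eps is small enough that d stays strictly increasing
    (eps = 0.5 is comfortably within that range for this bump).
    """
    return x + eps * (b - a) * bump(x, a, b)


def inverse_hole_diffeo(y: float, a: float = 0.0, b: float = 1.0,
                        eps: float = 0.5, tol: float = 1e-12) -> float:
    """Invert d by bisection; outside the hole the inverse is the identity."""
    if y <= a or y >= b:
        return y
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hole_diffeo(mid, a, b, eps) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def drag(field, a: float = 0.0, b: float = 1.0, eps: float = 0.5):
    """Drag a scalar field along d: the new field at d(p) takes the old value at p."""
    return lambda y: field(inverse_hole_diffeo(y, a, b, eps))


if __name__ == "__main__":
    phi = lambda x: x * x          # an arbitrary illustrative field
    dragged = drag(phi)
    for point in (-1.0, 0.5, 2.0):
        print(point, phi(point), dragged(point))
```

The original and the dragged field play the role of two models related by a hole diffeomorphism: they agree everywhere outside (0, 1) and disagree inside it.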


[Figure 1: the action of a hole diffeomorphism d. Outside H, p is sent to itself; within H, q is sent to r and r to s.]


I turn to Earman and Norton’s interpretative claims. There are three: 


(a) spacetime theories obeying (GC) form a wide and important class 
([1987]; pp. 517-8);

(b) substantivalism regards two models related by a ‘hole diffeomorphism’ 
as representing distinct physical possibilities ([1987]; pp. 521-2); 

(c) determinism is a matter of a single physical possibility being determined 
by the specification of the geometric objects on some region that is less 
than the whole manifold (determinism being weaker, the larger the 
region). 


It follows that substantivalism rules out at once even very weak determinisms
for all such theories. (The argument is essentially Einstein's famous 'hole
argument'; it played an important role in his search for general relativity. For
the history, see Earman and Glymour [1978]; Stachel [1980]; Norton [1984];
Torretti [1983]; pp. 163-8; Norton [1987]; Butterfield [1987]; §4.)

Thus the basic idea of the threat is this: substantivalism holds that spacetime 
points can occur in different physical possibilities with a permutation of each 
other’s properties and relations, and it thereby rules out determinism: for it 
allows there to be a permutation off the determining region, with no 
permutation on the determining region. Clearly (a) and (b) are crucial in 
supporting the first statement here, about what substantivalism holds. And in 
responding to the threat, I shall accept (c) and question only (a) and (b). I shall 
question them in Section 4 onwards; in fact Sections 5 and 6 will each describe 
a way to deny the statement. Before that, I shall discuss the definition of 
determinism, although not in a way that questions (c): see Section 3. 

Let me first set aside one strategy for responding to the threat: it saves
determinism by being austere about physical reality. It comes in two varieties:
instrumentalism ('Vienna') and constructive empiricism ('Princeton'). Vienna
holds that the only claims of our spacetime theories that need interpretation as
about physical reality are the strictly observational claims, and two models
related by a hole diffeomorphism as in Figure 1 are representations of the same
observable reality, say, because observable reality is a matter solely of
spacetime coincidences of objects. (Something like this seems to have been one
of Einstein's early reactions to the hole argument; cf. Earman and Glymour
[1978], pp. 273-6; Norton [1987]; §4.) Vienna faces problems of principle and
problems of detail: how, for example, can one make sense of spacetime
coincidences except in terms of intersecting supports for matter fields? These
problems prompt one to adopt Princeton: the only claims of our spacetime
theories that we need to believe are the strictly observational claims, so
substantivalism is false, and determinism is allowed for. I shall ignore this
strategy, in both its varieties: a scientific realist will not want to have such a
short way with the threat.


3 DETERMINISM IN TERMS OF MODELS 


There are two aspects to the reconciliation of substantivalism and determi- 
nism: technical and philosophical. The technical aspect concerns the fact that 
the threat (as presented by Earman and Norton, and by Einstein) does not use 
an exact definition of determinism. So perhaps there is an exact definition that 
is not violated by a pair of models related by a hole diffeomorphism. In this 
Section, I shall argue that this is indeed so. Although such a pair violates a 
common definition of determinism, they do not violate another definition that
can be extracted from general relativity. Furthermore, in several cases where 
the verdicts of the common definition as to whether determinism holds or fails 
are intuitively right, the second definition makes the same verdicts. 

The philosophical aspect concerns whether this definition can claim to be as 
good a definition as the more common one. If not, the technical victory will 
seem hollow. This aspect relates closely to Earman and Norton’s claims (a) and 
(b); I take it up in Section 4. 

Turning to the technical aspect, note first that a theory's satisfying active
general covariance, (GC), amounts to its treating metric structure, as coded by
metric fields and connection, on a par with matter fields. The theory constrains
them only by requiring them to satisfy field equations, just as for matter fields.
They are not to be postulated ab initio on the manifold and held fixed in every
model of the theory. For example, suppose one of our theory's models has as its
manifold $\mathbb{R}^4$, with the standard connection, D say; thus according to D, the
unique geodesic between (0,0,0,0) and (1,0,0,0) passes through (1/2,0,0,0). Let
the model be $\langle \mathbb{R}^4, D, O_i \rangle$. Now suppose that there is some non-standard
connection, D', on $\mathbb{R}^4$, and other geometric objects $O_i'$, such that the tuple
$\langle \mathbb{R}^4, D', O_i' \rangle$ satisfies the field equations. Then by Earman and Norton's
assumptions, this tuple is to count as a model of our theory, notwithstanding
the fact that it uses a non-standard connection on $\mathbb{R}^4$.

General relativity prompts this way of treating metric structure. The reason
lies in the existence of non-isometric models. Thus we say that two models
$\langle M, O_i \rangle$ and $\langle M', O_i' \rangle$ are isometric if there is a diffeomorphism d between
their manifolds such that those of the $O_i$ that code the first model's metric
structure (the metrics and connection) are dragged by $d^*$ to coincide with the
corresponding objects among the $O_i'$: for all image-points, d(p), we have
$d^*(O_i)(d(p)) = O_i'(d(p))$. We call such a diffeomorphism an isometry. (An isometry
that also drags the other geometric objects, the matter fields, of the first model
to those of the second, is called an isomorphism. We can consider such maps
on a single model $\langle M, O_i \rangle$; in that case we speak, respectively, of a metric
symmetry, and of a symmetry.) Now, in general relativity, the metric structure
is affected by matter, and the global topology need not be that of $\mathbb{R}^4$. So general
relativity has non-isometric models; indeed, it has non-diffeomorphic models.
And unless any two models of one's theory are isometric, one cannot write
one's theory as postulating a single manifold plus metric structure, the same in
every model: a fixed canvas on which matter fields get painted. On the other
hand, in most classical and special relativistic theories, metric structure is not
affected by matter and it is usual to assume that the global topology is that of
$\mathbb{R}^4$. So all models are isometric; and accordingly such theories can be presented
as having a single manifold and metric structure (in all cases, $\mathbb{R}^4$ with the
standard connection); thus symmetries are emphasised and the freedom to
make hole diffeomorphisms is suppressed. (Indeed, they usually are presented
in this way.)

This Section will show that whether all of a theory's models are isometric is 
closely related to whether the common definition of determinism is suitable for 
it. Thus I shall first state the common definition (deriving from Montague 
[1974], and Earman (most recently, [1986], p. 24)). Earman describes how 
this common definition suits some familiar spacetime theories that postulate a 
single manifold and structure, the same in every model; that is to say, it applies 
to them and in so far as one has intuitions about whether these theories are 
deterministic, its verdicts are intuitively right. However, it does not suit 
theories satisfying (GC), in particular general relativity: it is automatically 
violated by models related by a hole diffeomorphism. I shall then give a second 
definition, which suits theories satisfying (GC): it is not violated by such 
models. In the case of the familiar single-manifold-plus-structure theories, this 
definition is weaker than the first; but for most of these theories, it in fact makes 
the same verdicts as the first definition. That is to say, it does not rule as 
deterministic those familiar theories, that the first definition rules indetermi- 
nistic. To that extent, an advocate of the second definition can explain the 
attraction of the first. 

I do not claim that Earman and Norton's position turns on advocacy of the
first definition. Their paper does not define determinism precisely; and they are
well aware that the first definition does not suit theories satisfying (GC).
Indeed, we shall see below that the second definition is similar to one that
Earman himself suggests when he discusses the violation of the first definition
by Leibnizean spacetime, a violation that is similar to the violation by
(GC)-theories. However, Earman may think that a substantivalist is committed to
the first definition as the correct explication of determinism in terms of models.
If that were so, substantivalists who believe general relativity, or any theory
satisfying (GC), would indeed rule out determinism at a stroke. But I shall deny
that it is so.

There is another benefit from examining these definitions of determinism. It
is easy to get the impression that avoiding the threat is a matter simply of the
idea that since the field equations of a theory are powerless to pick out one
model from a set of isomorphic replicas, the uniqueness of solutions associated
with determinism should only be 'up to isomorphism'. And this impression is
false. We shall see that the first definition makes solutions unique only up to
isomorphism in a precise sense; but it is violated by models as in Figure 1. So
avoiding the threat involves more than just this idea of 'only up to
isomorphism'. My claim will be that it involves the second definition of
determinism.

The first definition is easy to motivate. We want determinism to mean that 
agreement on regions of a certain kind (typically sandwiches or slices) forces 
agreement elsewhere. But there is no meaning to a vector or tensor at a point 
in one manifold being the same as a vector or tensor at a point of another 
manifold. So we spell out agreement in terms of a diffeomorphism dragging the 
geometric objects in one model into coincidence with the geometric objects of
the other. To require that any diffeomorphism giving local agreement also
gives global agreement will be too strong: the definition would be violated by
paradigm deterministic theories like electromagnetism in Minkowski spacetime.
The natural weakening for single-manifold-plus-metric-structure theories
is to require that the isometries giving local agreement also give global
agreement.

We would also like to cover the case of a theory (like general relativity) not
all of whose models have manifolds that are diffeomorphic to one another. The
natural tactic is to make the statement conditional on the existence of a global
diffeomorphism between the models' manifolds, and on the manifolds
containing regions of the right kind (if the regions do not exist,
determinism will be vacuously and harmlessly true); and to require this
diffeomorphism to drag into coincidence whatever 'absolute' geometric
objects there are (metrics and connection in single-manifold-plus-metric-structure
theories). Thus we write:


Dm1 A theory with models $\langle M, O_i \rangle$ is $\mathcal{S}$-deterministic, where $\mathcal{S}$ is a kind of
region that occurs in manifolds of the kind occurring in the models, iff:
given any two models $\langle M, O_i \rangle$ and $\langle M', O_i' \rangle$ and any diffeomorphism
d from M onto M', that drags any absolute objects on M into
those on M', and any region S of M, of kind $\mathcal{S}$:

if d(S) is of kind $\mathcal{S}$ and also $d^*(O_i) = O_i'$ on d(S), then:
$d^*(O_i) = O_i'$ throughout M'.


(Montague and Earman (ibid.) give similar definitions.)


Note that the regions need not be submanifolds (slices are not). And as it 
stands, the definition does not require that the kind of region we are concerned
with be solely a matter of the regions’ differential structure and geometric 
objects; though if the kind is preserved under diffeomorphism, the condition 
above that d(S) be of kind S will be conveniently satisfied automatically. 

However for the discussion of determinism as uniqueness of solutions, it is
best to assume that the kind is preserved under diffeomorphism. Once we
assume this, we can see how this definition makes the uniqueness of solutions
associated with determinism hold only 'up to isomorphism': roughly speaking,
if $\langle M, O_i \rangle$ and $\langle M', O_i' \rangle$ are models of the theory matching on a region and
so everywhere, so also is any pair of models isomorphic to these two. To be
precise, suppose that $\langle M, O_i \rangle$ and $\langle M', O_i' \rangle$ obey the following condition:


(†) each contains at least one region of the right kind, and there are
diffeomorphisms d between M and M', and for any such region S on M we
have: any such diffeomorphism d satisfying $d^*(O_i) = O_i'$ on d(S) also satisfies
$d^*(O_i) = O_i'$ everywhere.


Then it follows that if $\langle N, P_i \rangle$ and $\langle N', P_i' \rangle$ are isomorphic respectively to
$\langle M, O_i \rangle$ and $\langle M', O_i' \rangle$, then $\langle N, P_i \rangle$ and $\langle N', P_i' \rangle$ also satisfy the above
condition (†). (This follows immediately from the kind of region being
preserved under diffeomorphism; to prove it, think of making the diagram of
maps between the four manifolds commute.)

Dm1 gives intuitively right verdicts about whether determinism holds, for 
some familiar theories postulating a single manifold and structure. Namely, 
theories using a classical spacetime with or without absolute rest, and theories 
using Minkowski spacetime; where the regions S are time-slices, or thin 
sandwiches, across the manifold. For details, cf. Earman ([1986], pp. 23-40; 
58-61), or Butterfield ([1987], pp. 15-17). 

However, Dm1 is violated by a theory with two models related by a hole
diffeomorphism; and thus by any (GC)-theory. For let $\langle M, O_i \rangle$ and $\langle M, O_i' \rangle$
be related by a hole diffeomorphism d which is identity on S, as extensive as
you like. The identity map i on M is a diffeomorphism between the models with
$i^*(O_i) = O_i'$ on S; while $i^*(O_i) \neq O_i'$ in the hole, M-S. And since there are no
absolute objects that i is required to drag into coincidence, i is a counterexample
to Dm1. This vindicates the point made above that avoiding Earman and
Norton's threat is not simply a matter of making solutions unique only 'up to
isomorphism': Dm1 does that, but is violated by such pairs of models.
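
The counterexample can be laid out step by step. The display below is a sketch under one assumption left implicit in the text, namely that d is not a symmetry of the objects inside the hole; H abbreviates M-S, and the `align*` environment assumes the amsmath package.

```latex
% Why the identity map i violates Dm1 for a pair of models related by a hole
% diffeomorphism d. Assumption: d is not a symmetry of the O_i within H = M - S.
\begin{align*}
  O_i' &= d^*(O_i), \qquad d|_S = \mathrm{id}_S, \qquad d|_H \neq \mathrm{id}_H
      && \text{(the two models)} \\
  i^*(O_i) &= O_i
      && \text{(the identity map moves nothing)} \\
  i^*(O_i) &= O_i = d^*(O_i) = O_i' \quad \text{on } S
      && \text{(antecedent of Dm1 holds)} \\
  i^*(O_i) &= O_i \neq d^*(O_i) = O_i' \quad \text{somewhere in } H
      && \text{(consequent of Dm1 fails)}
\end{align*}
```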

But we should not conclude that any (GC)-theory is indeterministic, in every
decent sense of that word. Some general relativity texts discuss the initial value
problem for general relativity, which has no single manifold for its models
and which satisfies (GC); and they prove a 'uniqueness up to isomorphism'
result, suggesting that determinism in some decent sense holds good. And
indeed, one can extract from these texts a definition of determinism, similar to
Dm1, according to which general relativity is deterministic. (This sense will
apply equally well to formulations of special relativity satisfying (GC), since
special relativity is the special case of flatness; and we shall see that it also
applies to (GC)-formulations of classical spacetime theories.) For the details of
extracting this definition from general relativity, see Butterfield ([1987], pp.
17-19, 26-9). The definition is:


Dm2 A theory with models $\langle M, O_i \rangle$ is $\mathcal{S}$-deterministic, where $\mathcal{S}$ is a kind of
region that occurs in manifolds of the kind occurring in the models, iff:
given any two models $\langle M, O_i \rangle$ and $\langle M', O_i' \rangle$ containing regions S, S'
of kind $\mathcal{S}$ respectively, and any diffeomorphism $\alpha$ from S onto S':
if $\alpha^*(O_i) = O_i'$ on $\alpha(S) = S'$, then:
there is an isomorphism $\beta$ from M onto M' that sends S to S', i.e.
$\beta^*(O_i) = O_i'$ throughout M' and $\beta(S) = S'$.


This differs from Dm1 in two ways. First, the diffeomorphism α assumed to
exist (i.e. given by the antecedent) need not be global; it need only be defined on
S. Secondly, the (global) isomorphism β that the consequent asserts to exist
need not extend α; that is, it need not agree with α on α's domain. Thus even if
the diffeomorphism α is given as global, Dm2 does not reduce to Dm1, with its
single diffeomorphism in antecedent and consequent. In this case, Dm2 is
weaker than Dm1, since its consequent replaces reference to α by an existential
generalization. Indeed, we cannot require that β extend α, on pain of having
Dm2 violated by a pair of models related by a hole diffeomorphism; i.e. on pain
of having Dm2 rule any (GC)-theory indeterministic. For given one model, the
identity map i on it is global, so that if β extends i, then β = i, and Dm2 reduces
to Dm1 and is violated by such a pair of models. However, as it stands, Dm2 is
not violated by such models and thus allows (GC)-theories to be deterministic.
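
As a compact restatement, the logical form of Dm2 can be set beside the stronger variant that the paragraph above rejects. This is only a sketch: the quantifier layout and the restriction notation β|_S are assumptions introduced here for illustration and do not appear in the original definition.

```latex
% Dm2: the consequent asserts only that SOME isomorphism beta exists.
\[
\forall \alpha : S \to S' \;
\Bigl[\, \alpha^{*}(O_i) = O_i' \text{ on } S'
\;\Longrightarrow\;
\exists \beta : M \to M' \;
\bigl( \beta^{*}(O_i) = O_i' \text{ on } M' \;\wedge\; \beta(S) = S' \bigr) \Bigr]
\]

% Stronger variant (rejected in the text): beta must also extend alpha.
\[
\forall \alpha : S \to S' \;
\Bigl[\, \alpha^{*}(O_i) = O_i' \text{ on } S'
\;\Longrightarrow\;
\exists \beta : M \to M' \;
\bigl( \beta^{*}(O_i) = O_i' \text{ on } M' \;\wedge\; \beta|_{S} = \alpha \bigr) \Bigr]
\]
```

With α taken to be the global identity map, the extra conjunct forces β to be the identity, which is why the stronger variant is violated by a pair of models related by a hole diffeomorphism while Dm2 itself is not.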

I thus claim that Dm2 is the general definition of determinism implicit in 
modern presentations of general relativity's initial value problem; and more 
generally, is suited to (GC)-theories. There are two further points in its favour. 
First, I described above how Dm1 makes solutions unique only up to
isomorphism. One can check that Dm2 has a similar property. That is, if a pair 
of models satisfies the weaker variant of (t) obtained in the obvious way by 
existential generalization, then so does another pair of models given as 
isomorphic to the first two. 

Secondly, if we apply Dm2 to familiar single-manifold-plus-structure
theories, it very often delivers the same verdicts as Dm1. Thus an advocate of
Dm2 can in a sense explain the attractions of Dm1. For example, one can check that
Dm2 and Dm1 agree about determination by the state on a slice or a sandwich,
for theories using classical or Minkowski spacetime (Butterfield [1987], pp.
29-30).

I agree that there are other less familiar single-manifold-plus-structure 
theories that Dm1 rules indeterministic, and Dm2 rules deterministic. One
example—Leibnizean spacetime—is relevant to us. For in discussing it,
Earman suggests a definition of determinism very close to Dm2. Leibnizean
spacetime is, in short, classical spacetime without absolute rest and without a
connection. This means that whatever laws governing matter one adds, there
are metric symmetries that are the identity up to time t = 0 and differ from the
identity thereafter: for example, one can smoothly ‘turn on’ a rotation after t = 0. Now,
the following condition is compelling: if d is a metric symmetry of a model
<M, O_i>, then <M, d*O_i> is also a model, i.e. satisfies the theory’s field
equations (for the argument, cf. Earman [1986], p. 26). This implies that Dm1
will fail, even in the weak form in which the entire past before t = 0 determines
the future. For given a model, we can apply a smoothly turned on rotation to
produce another dragged-along model (with the same manifold). Dm1 fails
because the identity map on the manifold will give matching of the regions
t<0, but not of t>0 (Earman [1986], pp. 24-9; Stein [1977], pp. 5-6).
However, Dm2 need not fail; for with the existential generalization in its
consequent, it does not require the global match to be given by the same map
as was initially picked for the match on region S (in the example, by the identity
map)—the global match can be given by the symmetry itself. And indeed
Earman writes down a special case of Dm2, with the map α global and S the
absolute simultaneities (ibid., p. 53, fn. 1).
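
The turned-on rotation can be illustrated with a toy numerical sketch. Everything in it is an assumption made for illustration: the coordinates (t, x, y), the particular turn-on function `angle`, the Gaussian scalar field `rho` standing in for a matter field, and the sample points. It checks only that the identity map matches the original and dragged fields before t = 0 but not after, while the rotation map itself matches them everywhere.

```python
import math

def angle(t):
    # Hypothetical smooth turn-on: zero up to t = 0, positive afterwards.
    return 0.0 if t <= 0 else math.exp(-1.0 / t)

def rotate(x, y, a):
    # Rotate the spatial point (x, y) by angle a about the origin.
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def d(p):
    # The map d: identity for t <= 0, a time-dependent rotation for t > 0.
    t, x, y = p
    return (t, *rotate(x, y, angle(t)))

def d_inv(p):
    # Inverse of d: rotate back by the same time-dependent angle.
    t, x, y = p
    return (t, *rotate(x, y, -angle(t)))

def rho(p):
    # Toy matter field: a static Gaussian bump centred on (x, y) = (1, 0).
    _, x, y = p
    return math.exp(-((x - 1.0) ** 2 + y ** 2))

def dragged_rho(p):
    # The field dragged along by d: its value at p is rho's value at d_inv(p).
    return rho(d_inv(p))

past = [(-2.0, 1.0, 0.0), (-0.5, 0.3, 0.7)]
future = [(2.0, 1.0, 0.0), (5.0, 0.3, 0.7)]

# Matching by the identity map: succeeds on t < 0, fails on t > 0.
identity_matches_past = all(math.isclose(rho(p), dragged_rho(p)) for p in past)
identity_matches_future = all(math.isclose(rho(p), dragged_rho(p)) for p in future)

# Matching by d itself: the dragged field at d(p) equals the original at p.
d_matches_everywhere = all(
    math.isclose(rho(p), dragged_rho(d(p))) for p in past + future
)
```

The first two flags correspond to the failure of Dm1 in this example; the third corresponds to the global match that Dm2's existential consequent allows.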

Earman goes further. He claims that if Leibniz were confronted with the
violation of Dm1 by theories using his preferred spacetime structure, he would
object that Dm1 presupposes substantivalism. More precisely, Leibniz's
rejection of substantivalism would make him hold that isomorphic models
(perhaps even with manifolds built from different sets of points) are ‘just
different modes of presentation of the same physical reality’ (ibid., p. 28). And
so Leibniz would not regard the above violation of Dm1—by a pair of models 
related by an isomorphism—as a violation of determinism, in the intuitive 
sense of the determination of a single physical possibility by the physical state 
on a region. Earman also hints that for Leibniz, Dm2 is better in this regard 
(ibid., p. 53, fn. 1). And certainly Dm2 will not be violated by a pair of 
isomorphic models with an isomorphism matching regions of kind S. 

Never mind the historical accuracy of Earman's claims about Leibniz. (I for
one would not question this.) Certainly, Dm1 does presuppose substantivalism 
in the above sense; and Leibniz's point here clearly leads in to Earman and 
Norton's threat. On the other hand, it is far from clear that Dm2 escapes 
substantivalism. Like Dm1, it quantifies over points. And its existential 
generalization over isomorphisms looks like some kind of freedom to match or 
identify points across models, rather than a freedom from commitment to 
points. 

So far I have urged the merits of Dm2, while only discussing models. 
Questions remain about whether Dm2 is as good a definition as Dm1. In 
particular, does it explicate the basic idea of determinism as faithfully? I shall
assume that the basic idea of determinism is that a single physically possible
world is specified by the physical state on a certain region of spacetime: given
the state on the region, there is only one physical possibility (cf. claim (c) of
Section 2). Since the two definitions are cast in terms of models <M, O_i>
rather than possible worlds, this assumption means that I need to consider the 
relation between models and possible worlds: see Section 4. 

Of course, some are suspicious of possible worlds, and will deny that we
should cast determinism in terms of them. Determinism is to be simply a 
feature of the class of models, in logicians' sense, of a precise formulation of a 
spacetime theory: a feature of technical interest, since it relates to the existence 
and uniqueness of solutions. This position is in effect another strategy 
('Harvard') for responding to the threat. It promises to save substantivalism by
being austere about possibility: if determinism is a purely technical feature, a 
scientific realist should presumably allow it to fail when it is threatened by 
substantivalism. 

I cannot endorse this strategy, since I do not share its austerity about 
possibility. But for those who do, the discussion above remains pertinent: even 
if determinism is a purely technical feature, it is worth seeing how it can be 
defined so as not to be violated by a pair of models related by a hole 
diffeomorphism. 


4 MODELS AND WORLDS 


This is not the place to defend possible worlds. For a recent account of their 
usefulness in analysing or explicating philosophically interesting notions, and 
of the debate about their nature, cf. Lewis [1986]. Suffice it to say here that in 
the present application of possible worlds, as in so many others, we can very 
largely remain neutral on the debate about their nature; in particular, about 
whether Lewis’ controversial realism is right. 

To respond to the threat and reconcile substantivalism with determinism 
cast in terms of worlds, I need to address three questions. What is the relation 
between models and worlds? And once determinism is cast in terms of worlds, 
is Dm2 as good a definition as Dm1? (If not, the technical victory of Section 3 
will be hollow.) And what should a substantivalist say about the transworld 
identity of points? The discussion of each question will lead into the next. I shall 
treat the first two questions in this Section; the third in Sections 5 and 6. A 
word of warning: this Section will emphasise the basic idea of determinism as
the state on a region specifying a single physically possible world. But I do not
claim that how well the definitions Dm1 and Dm2 explicate this idea is the
only yardstick for assessing them. A definition may bring theoretical gains
which more than make up for some conflict with the basic idea. Indeed, we
shall see an example in Section 6. The position I advocate there makes the basic
idea automatically true, in a trivial way, while Dm2 is not. No worries: overall,
Dm2 is the best definition.

I need to distinguish between physically possible worlds, and models (tuples)
that purport to represent them. This will not prejudge the debate about realism
concerning possible worlds: it may be that possible worlds are ersatz objects, it
may be that they are set-theoretic tuples. But even if this is so, we can ask about
the relation of representation between tuples of the kind, <M, O_i>, that occur
in discussion of spacetime theories and physically possible worlds. In particular,
Earman and Norton's threat means we must consider whether this relation is 
one-one, or one-many (some tuple represents more than one world), or many- 
one (some world is represented by more than one tuple), or many-many (both 
one-many and many-one). 

We need to set aside one issue bearing on this question: the issue of whether 
the physics expressible in terms of geometric objects on spacetime determines 
all the facts about a physically possible world. To say Yes is to espouse a strong 
determinationist physicalism— strong because all the facts are not only 
determined by physics, but by spacetime physics. To say No means that in 
general one tuple <M, O_i> represents different physical possibilities: so the
representation relation is not one-one, but is one-many (it may also be many-
one). To simplify discussion, I shall set aside this issue; that is, I shall write as if
the answer is Yes. 

Having set aside the issue of physicalism, it follows that any response to the 
threat must address the following question. (It is a special case, convenient for 
us, of Earman and Norton’s ‘Leibniz equivalence’ ([1987], p. 522)): 


(Same?) Suppose given a theory and two isomorphic models <M, O_i> and
<M', O_i'>, with the same base-set. That is, M and M' are built from
the same set of points, and there is a diffeomorphism d of M onto M',
dragging the O_i into the O_i' (all objects, not just metric fields). Note
that these models may differ on which properties and relations
(coded by the objects) are ‘painted’ on which points of the common
base-set; cf. Figure 1. Does each model represent the same physically
possible world?
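
The situation described in (Same?) can be pictured with a deliberately crude finite sketch. It is an assumption-laden toy: a four-element base-set and a permutation stand in for a manifold and a hole diffeomorphism, so none of the topological or differential structure is modelled, and the point names, colours and the helper `drag` are all hypothetical.

```python
# Toy base-set shared by both models; 'r' and 's' lie inside the hole.
base_set = ["p", "q", "r", "s"]
hole = {"r", "s"}

# A model 'paints' one property on each point of the base-set.
model = {"p": "red", "q": "green", "r": "blue", "s": "yellow"}

# Stand-in for a hole diffeomorphism: identity outside the hole,
# a swap of the two points inside it.
d = {"p": "p", "q": "q", "r": "s", "s": "r"}

def drag(painting, mapping):
    # Carry the property painted on x over to the point mapping[x].
    return {mapping[x]: prop for x, prop in painting.items()}

dragged = drag(model, d)

# The two models agree outside the hole and differ inside it ...
same_outside = all(model[x] == dragged[x] for x in base_set if x not in hole)
differ_inside = any(model[x] != dragged[x] for x in hole)

# ... yet d maps one painting exactly onto the other, so they are isomorphic.
is_isomorphism = all(dragged[d[x]] == model[x] for x in base_set)
```

Both dictionaries use the same keys, which is the sense in which the models share a base-set while differing on which properties are painted on which points.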


How one answers this question will affect whether one allows or rules out 
determinism for spacetime theories satisfying (GC). The situation is broadly as 
follows. 

If we answer Yes to (Same?), then hole diffeomorphisms from a manifold 
onto itself do not threaten determinism. That is, they do not threaten the basic 
idea of determinism: that a single physically possible world is specified by the 
facts on a certain region of spacetime. For answering Yes to (Same?) implies 
that only one world is at issue, and there is thus no prospect of two worlds 
disagreeing on a hole. 

If we answer No to (Same?), there are two options. First, (Each): We say that
in general each of the two models represents a different physically possible
world. If we say this, the basic idea of determinism will be violated by two such
models. Secondly, (One): We say that at most one of the two models represents a
physically possible world. Sections 5 and 6 will each provide a reason for
option (One). For the moment note only that (One) blocks the inference from
answering (Same?) by No, to the violation of the basic idea of determinism. For,
just as on the Yes answer, there is at most one world at issue, and thus no 
prospect of two worlds disagreeing on a hole. 

Earman and Norton say in effect that the practice of physics is to answer Yes 
to (Same?) ([1987], p. 522, fn. 1). They also say that substantivalism must 
answer No; and they take this to mean (Each)—they ignore the second option,
(One) (cf. their claim (b), in Section 2). 

I agree with the first claim. To be precise, physics texts do not usually 
distinguish models and physically possible worlds. But the more careful 
modern texts of general relativity do so, and they answer Yes. Indeed, they go 
further: any two isomorphic models, regardless of whether their base-sets are 
the same, both represent the same world. The base-sets can even be disjoint 
(Hawking and Ellis [1973], pp. 56, 227-8; Sachs and Wu [1977], p. 27; cf. 
Leibniz at the end of Section 3). 

I also agree with Earman and Norton that substantivalism must answer No.
For belief in points implies belief that a possible world fixes its population of 
points and their properties and relations; so models distributing such 
properties and relations differently simply cannot both represent the same 
world. (So I admit that there is a clash between the practice of physics, and the 
position I favour; but I shall hold that this clash is less bad than one facing 
Earman and Norton.) But of course I deny that substantivalism must endorse 
(Each). It can endorse (One); and besides, I shall argue that it has good reason 
to do so, quite apart from securing the possibility of determinism. 

But before arguing for (One), I should assess the merits of Dm1 and Dm2 as 
explications of the basic idea of determinism. Since these definitions are cast in 
terms of models, assessing their merits requires us to consider the relation 
between models and worlds; and thus to consider the various answers to 
(Same?). I shall now argue that whether we answer Yes or No to (Same?), Dm2 
is a better explication of the basic idea than Dm1.

Suppose we answer Yes; and suppose that, like the physics texts (and like 
Leibniz), we go further and say that any two isomorphic models, even with 
disjoint base-sets, represent the same world. Then determinism must be a 
matter of the state on a region specifying an isomorphism class of models of the 
theory. We must surely interpret this as: if the state on a region is given as the 
‘same’ in any two models of the theory, then the models are isomorphic. At first 
sight, this last can be made precise either as Dm1 or as Dm2; so that this Yes 
answer can accept both definitions as explicating the basic idea. However, we 
remarked above that since answering Yes to (Same?) makes two models
related by a hole diffeomorphism represent the same world, this answer should
suffice to prevent hole diffeomorphisms threatening determinism, defined in 
terms of worlds—there is only one world at issue. Yet Dm1 is violated by such 
models. So this Yes answer should prefer Dm2 as explicating the basic idea. (A 
similar point holds for the position that answers Yes to (Same?), but does not 
‘go further’; for Dm1 is violated by models built on the same base-set.) 

So if we answer Yes to (Same?), Dm2 is preferable to Dm1. I have no brief to 
defend the common definition Dm1; so I see here no problem for the Yes 
answer, though champions of Dm1 might do so. 

There is however a problem facing this Yes answer, that relates to the 
distinction between models and worlds; or rather, a challenge. If isomorphic 
models represent the same possible world, each model has redundancy in the 
way it represents. The membership of the base-set, and the way the properties
and relations are distributed among these members, are an artefact of the
representation. (Contrapositively: If they are not an artefact, then specifying a
world involves specifying a base-set and a distribution of properties on it—so
that one must answer No to (Same?).) So one faces a challenge: rewrite
spacetime theories so as to eliminate the redundancy: that is, give a direct
account of worlds, and show that the models in an isomorphism class arise as
equivalent representations of a single world. And if one rises to this challenge,
a definition of determinism in terms of worlds, that is very different from Dm1
and Dm2, should be forthcoming. (A parallel challenge faces the position that
answers Yes to (Same?), but does not ‘go further’: its equivalent representations
are isomorphic models with a common base-set.)

Earman and Stachel, who both answer Yes to (Same?), have risen to the 
challenge. Earman has explored one approach to rewriting our theories 
without reference to spacetime points [1977, 1987]. And Stachel has explored 
another more conservative approach ([1985], §6). Stachel retains points, but
eliminates the redundancy by incorporating the arbitrary choice, of which
point is the locus of a given physical point-event, into the representation of the 
event; thus each event is represented by a map sending points of the manifold 
to sets of geometric objects. I believe that both approaches may well be 
workable; but to save space, I shall concentrate on Earman's more radical 
approach. (I shall also ignore the approaches of Manders [1982] and Mundy 
[1983], which I believe unsatisfactory.) 

Since isomorphic models should turn out to be equivalent representations, 
using spacetime points, of a single physically possible world, Earman has to 
recover differential manifolds as representations of another notion. He 
suggests that this might be done satisfactorily, using Geroch's [1972] idea of 
an Einstein algebra. Geroch shows that the construction of geometric objects 
(first, directional derivatives and so on up) on a differential manifold can be 
done with only an initial reference to the points; thereafter one can refer 
always to the algebra of smooth scalar fields on the manifold. Diffeomorphic
manifolds will have isomorphic algebras. Earman therefore suggests that we
take as our fundamental object the algebra, in abstraction from its representation
as smooth scalar fields on a manifold. These representations will be non-unique,
determined only up to isomorphism of spacetime models <M, O_i>, just
as one would want. Determinism is then to be formulated in terms of a 
subalgebra determining the algebra; and thus formulated, determinism will be 
saved from the threat of the hole argument. 

This approach may be workable. Let me say in its defence that it does not 
require one (that is, a scientific realist) to believe that the world is an abstract 
algebra! One takes the elements of the algebra as non-abstract, as physical 
entities that can be thought of as scalar fields, once one picks a manifold so as 
to give a representation. Thus the proposal posits a collection of what-nots that 
are like scalar fields; and posits a very richly structured collection of such— 
recall that in a representation with a manifold, for every region, no matter how 
small or gerrymandered, there is a scalar field with that region as support. The 
proposal is thus a sophisticated relative of the mereological constructions of 
points, associated with authors like Whitehead and Tarski; they construct 
points satisfying e.g. Euclidean geometry, in terms of a richly structured 
collection of extended regions of space, e.g. the spheres with all radii and all 
centres. I think that a scientific realist should by and large be sceptical of such 
constructions: trading in all the points for all the regions, or all the spheres, 
seems to me to be no great gain—there is no greater warrant for believing in 
such a richly structured collection of regions, than for believing in all the 
points. But I agree that Earman's approach has an important difference: the 
threat suggests that believing in points rules out determinism, so that the trade
is worthwhile in this case, even if the familiar mereological trade is not. 

However, this approach certainly clashes with the practice of the textbooks, 
with their explicit and frequent quantification over points. I submit that this 
clash is sharper than the clash between the textbooks answering Yes to 
(Same?) and substantivalism having to answer No. (I would say the same 
about Stachel's clash with the practice of the textbooks, though it is smaller 
than Earman's.) 

What if we answer No to (Same?)? Is Dm1 or Dm2 to be preferred? The 
situation is a little complicated, since answering No yields two options. (Each): 
Each of the two models related by a hole diffeomorphism represents a world; 
and (One): At most one of the models represents a world. (Each) implies that 
the basic idea of determinism is ruled out by any theory which has such pairs of 
models. Since Dm1 is violated by such pairs of models, and Dm2 is not, an 
advocate of (Each) will prefer Dm1 over Dm2. But of course I reject (Each): 
substantivalism must answer No, and should not rule out determinism at a 
stroke. So it must endorse (One). 

On option (One), the merits of Dm1 and Dm2 are as on the Yes answer to 
(Same?). (One) means that at most one world is at issue, so that models related
by a hole diffeomorphism should not threaten determinism; but Dm1 is
violated by such models; so Dm2 is to be preferred. As before, I have no brief to 
defend Dm1; and so do not regard this result as an argument against (One). 

We can at last attack the question: can substantivalism justify (One)? I think 
there are two ways to do so: essentialism and denying transworld identity. I 
shall argue that the latter has advantages quite independent of determinism; 
and that it can accommodate Dm2 more easily than essentialism. 


5 ESSENTIALISM 


The basic idea of essentialism is to justify (One) by individuating the points by 
some of the fields. Thus essentialism holds that some models fail to represent a 
world, because they are not faithful to the essential properties and relations of 
some of their points: such models 'paint' points with properties or relations 
that are impossible for them, just as ‘being a poached egg’ is impossible for Hubert
Humphrey. In order to justify (One), essentialism must rule out as ‘faithless’ all
the models related by a hole diffeomorphism to a model that is given as faithful 
(as representing a world). So it must attribute to points a sufficiently rich 
collection of essential properties and relations: rich enough that a hole 
diffeomorphism applied to a faithful model gives a faithless one. Now, since a 
hole diffeomorphism 'moves' all the geometric objects, i.e. shuffles all those 
properties and relations of points in the hole, that are coded by geometric 
objects, essentialism can get a rich enough collection by choosing one or more 
geometric objects. The properties and relations coded by any such object are 
shuffled by a hole diffeomorphism; so assuming the given distribution was 
essential, the new distribution is impossible and the new model faithless. 

A particular version of essentialism has already been advocated: Maudlin 
[1988] has advocated essentialism about the metric field and connection. That
is, he is a substantivalist who replies to Earman and Norton’s threat by
claiming that points have their metrical properties and relations (as coded by
the metric field and connection) essentially. Thus if r is in fact distance 1 from p,
as in Figure 1’s first model, then it could not be some other distance—and 
Figure 1’s second model does not represent a genuinely possible world. I shall 
discuss the advantages, and then the disadvantages, of metrical essentialism; 
we shall see that the situation for other versions of essentialism is analogous. 

Maudlin defends metrical essentialism in two ways. First, he urges that it is 
plausible, quite apart from offering an escape from Earman and Norton's threat.
He points out that there are strong hints of metrical essentialism in Newton, 
and in Einstein’s emphasis, in his later discussions of the hole argument, on 
points being distinguished only by reference to the metric field; (cf. Maudlin 
[1988], pp. 29-30, 41-2; Norton [1987]: §6.) He also points out that Earman 
and Norton’s use of diffeomorphisms in posing their threat means that they 
concede to the substantivalist the right to be essentialist about points’
topological and differential properties and relations ([1988], p. 31). For
diffeomorphisms preserve these; and if one took them to be accidental rather 
than essential, one could avoid all the technicalities of diffeomorphisms and 
instead simply use an arbitrary permutation of the points to produce a new 
possibility. As before the permutation would be the identity map outside the 
hole, and differ from identity within the hole. Thus Maudlin urges that 
substantivalists should take more properties and relations to be essential than 
Earman and Norton concede: metrical, as well as topological and differential 
ones. And this makes it clear that Maudlin’s essentialism is modest: the set of 
metrical properties is about the smallest set of properties, including topological 
and differential ones, whose being essential enables one to escape from Earman 
and Norton's threat (cf. his fn. 34). 

Secondly, Maudlin addresses arguments against metrical essentialism, that 
are implicit in Earman and Norton. Thus Earman and Norton make the point 
at the start of Section 3 above: that active general covariance, (GC), involves 
treating metric structure on a par with matter fields, and they go on to give 
reasons for such a treatment. I think that as regards all but one of these 
reasons, Maudlin succeeds in showing that the reason fails if one is an 
essentialist ([1988], pp. 32-5; 38-40). I shall spell out one of Maudlin's 
successful replies: namely, his reply to what Earman and Norton call the 'acid
test' of substantivalism, drawn from Leibniz. This reply will illustrate
essentialism (though I shall express it in my terminology, and will expand it 
somewhat). 

Earman and Norton consider Leibniz's question: if everything in the world 
were translated three feet East, would we have a different world? They take it 
that substantivalists must answer Yes to this; and that their hole diffeomorphism
is the counterpart of Leibniz’s translation, so that substantivalists must
deny ‘Leibniz equivalence: diffeomorphic models represent the same physical
situation’ ([1987], p. 11). And they take this to mean that substantivalists
must choose option (Each) (cf. their claim (b) of Section 2).

The basic idea of Maudlin’s reply is to deny that the hole diffeomorphism is 
the counterpart of Leibniz's translation. He points out that we should 
distinguish two methods of producing one model from another by having a 
diffeomorphism drag along geometric objects: (i) It drags along all geometric 
objects; and (ii) it drags along only the matter fields, and is not applied to the 
metrics and connection. Thus by method (i), we produce <M, d*O_i> from
<M, O_i>, just as in Section 2. But by method (ii), writing the given model as
<M, g, D, O_i>, g the metric or metrics, D the connection, O_i the matter fields, we
produce <M, g, D, d*O_i>. Maudlin then claims that while Earman and Norton's
hole diffeomorphisms are examples of (i), Leibniz had in mind method (ii).
Furthermore, in method (i), the spatiotemporal properties and relations of
points are in general altered so that a metrical essentialist can rule the
produced model to be faithless and to not represent a possible world; but in
method (ii), only the properties and relations coded by matter fields are altered,
and the metrical essentialist can perfectly well answer Yes to Leibniz's
question—i.e. say that the produced model represents another possible world.

I should add three details to this reply. (1) The distinction between methods
(i) and (ii) is obscured by the example of spatial translation in a classical (or
Minkowski) spacetime. For in these spacetimes, spatial translation is a metric 
symmetry, so it drags the metrics and connection into coincidence with 
themselves. But that is no trouble for Maudlin: he will hold that since the 
essential metrical properties are preserved, the produced model represents 
another possible world. 

(2) The claim that for method (ii), the substantivalist can and should answer
Yes to Leibniz's question—we would have a different world—needs slight
qualification, if as usual matter is represented by fields. For method (ii) does not
always give some points different properties and relations. Usually it does; and
the substantivalist then has every right to say we get a different world. But
there will be exceptions, when the dragging along of matter fields produces a
model that is identical with the given model. That is to say, in some cases the
diffeomorphism drags each matter field of the given model into coincidence
with itself. (Such a diffeomorphism is called a symmetry of the matter fields; cf.
the analogous notion of a metric symmetry, introduced in Section 3.) Thus
take Leibniz’s example of translation three feet East, defined as a map on a
classical (or Minkowski) spacetime: the exceptional models are those whose
matter fields are periodic in the direction East, with a period that divides three
feet—think of how dragging by three feet a scalar field, that has a period of 1½
feet, drags the field into coincidence with itself. (Even this qualification falls
away if we take matter to have an identity that is not expressed by equality of
values of fields.)

(3) The model produced by method (ii) is in general not only different from 
the given model; it also is not a model of the theory. That is, it does not satisfy 
the theory’s field equations. Here, Leibniz's example is atypical: it is a special
feature of classical and Minkowski spacetime that spatial translation of a
solution will produce another solution. The reason lies in the fact that in these
spacetimes, spatial translation is a metric symmetry; and in the requirement,
mentioned on pp. 9-10, that a metric symmetry applied to a model of a theory
should produce another model of it. 
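
The exception noted in detail (2), a matter field whose period divides three feet, can be checked numerically. This is a minimal sketch under stated assumptions: one spatial coordinate x measured in feet along the East direction, a sinusoidal scalar field as the toy matter field, and the periods 1.5 and 2.0 chosen purely for illustration.

```python
import math

SHIFT = 3.0  # Leibniz's translation: three feet East

def make_field(period):
    # Hypothetical scalar matter field, periodic in x with the given period.
    return lambda x: math.sin(2.0 * math.pi * x / period)

def drag_east(field, shift):
    # The dragged field takes at x the value the original had at x - shift.
    return lambda x: field(x - shift)

def coincides(f, g, points):
    # Compare two fields on sample points, tolerating rounding error near zero.
    return all(math.isclose(f(x), g(x), abs_tol=1e-9) for x in points)

samples = [0.0, 0.4, 1.1, 2.7]

# Period 1.5 ft divides 3 ft: dragging returns the very same field.
phi = make_field(1.5)
unchanged = coincides(phi, drag_east(phi, SHIFT), samples)

# Period 2 ft does not divide 3 ft: dragging yields a different field.
psi = make_field(2.0)
changed = not coincides(psi, drag_east(psi, SHIFT), samples)
```

In the first case method (ii) reproduces the given model, so there is no second world to count; in the second case some points receive different field values, which is the usual situation described above.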

I said there was one reason, that Earman and Norton give for their treatment 
of metric structure on a par with matter fields, that Maudlin does not 
successfully rebut. This reason concerns the possibility of non-isometric 
models (cf. p. 5, at the start of Section 3). They in effect make three points 
([1987], pp. 518-9, 522, fn. 2; cf. also Earman [1987], p. 19). First, they
concede that if all the models of the substantivalist's theory are isometric, then 
the substantivalist can write his theory as postulating a single manifold plus 
metrics and connection, the same in every model; this will be a fixed canvas on
which matter fields get painted, and (GC) will fail. (As mentioned on p. 6, this is
how classical and special relativistic theories are usually presented.) Secondly,
they say that if such a theory has metric symmetries—as it usually does—then
their threat can nevertheless be resurrected (p. 522, fn. 2). Thirdly, they point
out that our current best theory, general relativity, has pairs of models that are
non-isometric, indeed non-diffeomorphic. They conclude that if one’s
substantivalism is based on scientific realism about general relativity, one cannot
avoid (GC) and its concomitant treatment of metric fields on a par with
matter. 

Maudlin in effect agrees with the first and third points (he would deny the 
second). That is, he admits that his essentialism cannot handle a theory with 
non-isometric models. For his essentialism concerns only the actual world’s 
points. He claims only that the actual points have their metrical properties and 
relations to one another essentially, so that a possible world containing the 
actual points must be isometric to the actual world. Accordingly, Maudlin 
briefly suggests that the substantivalist should handle a theory with non- 
isometric models, by denying transworld identity for the points concerned and 
instead using counterparts: a point in a world that is not isometric with the 
actual world is not identical with any actual point, but at best a counterpart of 
it ([1988], pp. 37-8). 

Non-isometric models thus prompt Maudlin to say that some possible points 
are not identical with any actual point. In Section 6, I will expand this 
suggestion arguing for a more radical denial of transworld identity: any point 
is an inhabitant of just one possible world. But before doing that, let us see 
whether a metrical essentialist can respond to Earman and Norton's argument 
about non-isometric models, while retaining transworld identity of points, i.e.
while having every possible point be an actual point. I shall argue that this 
position is implausible, thus lending some support to Section 6. 

Since this position takes every possible world to use the same population of 
points, viz. the actual points, it must relativize the essential properties of each 
point to an isometry class of models. Thus this position not only says, as 
Maudlin does, that within the isometry class of the actual world, the only 
models that represent possible worlds are those (i) whose manifold has as its
base-set of points, the actual points; and (ii) which give each point the same 
metrical properties and relations to other points as it has in the actual world. 
This position makes a similar claim about any isometry class of models of the 
theory: for each class, it assumes there is some favoured way in which the 
metrical structure characteristic of the class is to be ‘painted’ on the actual 
points, and it rules as faithless all models that do not have this painting. That is, 
it says that the only models of the class that represent possible worlds are those
(i) whose manifold has as its base-set, the actual points; and (ii) which give
each point the same metrical properties and relations as it has in the assumed
favoured model. In other words, this position holds: given a metric structure
permitted by one’s theory, there is a physically possible world with the actual 
points as its base-set, with the given structure painted on the actual points, and 
such that all models in the isometry class of this world, that do not have the 
same base-set and the same painting of metric structure on them, are faithless. 

This position may be technically workable: in particular, every model of 
every familiar theory has continuously many points (Geroch [1968], p. 1743),
so that the actual points will be numerous enough to serve as the points of any 
world. But I reject it, for two reasons. First, it is strange to relativize essential 
metrical properties to an isometry class. For it amounts to saying that the 
metrical properties of an actual point can be otherwise than they actually are, 
but only if the point is embedded in a non-isometric structure; so a point can 
only alter its metrical properties if all the others also alter them radically 
enough to give a non-isometric structure—‘you can alter only if everyone 
alters radically'. 

Secondly, there is a problem of silence. The position entails that there are 
facts of the matter that it does not decide. That in itself does not refute a 
position; but I think there are too many such facts in this case. Thus, the 
position, and our other background beliefs, are silent about which is the 
favoured way to paint a metric structure, non-isometric with the actual world, 
onto the actual points. You might say that there is a natural choice for the 
favoured painting: it is to be the one that makes points' metrical properties as 
close as possible to their actual properties. 'Close' needs to be made precise; but 
that can probably be done, by choosing weightings for the various metrical 
properties. I reply: making it precise probably involves choices, e.g. of 
weightings, that we can only see as arbitrary—which is just my point: the 
position posits facts about which we have no beliefs. Furthermore, however 
you make 'close' precise, the problem of silence will arise again because of the 
existence of metric symmetries. Thus if a metric structure has a symmetry, 
there will be at least two ways to paint the structure on the actual points, 
which will be tied for first equal according to any precise criterion of closeness. 
For given one painting, I can apply the metric symmetry to define another 
equally close painting. (And one can't say it's vague what is the favoured 
painting: vagueness is semantic indecision and if a question is posable in a part 
of our language where no expressions are vague, it cannot have a vague 
answer. I owe this point to Lewis [1986], pp. 212-13.)

So far I have discussed only essentialism about metrical properties: first 
Maudlin's version, and then a version that relativizes essential properties to an 
isometry class. But we can now see that the problems facing essentialism— 
Maudlin's problem with models non-isometric with the actual world, and the 
new version’s problems of implausibility and silence—will carry over to other 
choices of essential properties. Thus suppose we choose some of the geometric 
objects, O_k say, as coding points’ essential properties. Then we can define an
O_k-isomorphism, on analogy with an isometry: a diffeomorphism that drags the
O_k from one model into coincidence with the corresponding objects O_k′ in
another. In general, a theory will have models that are not O_k-isomorphic with
the actual world; and this will give an analogue to Maudlin's problem. If we
want to respond to this problem, while making the actual points serve as the
points of any world, we must relativize the essential properties to
O_k-isomorphism classes, and for each such class assume a favoured painting of the
O_k-structure on the actual points. This will give problems of implausibility and
silence again. (Agreed, there is one consequence of including matter fields in
the O_k that code essential properties, that many (including perhaps Einstein)
have found attractive. Namely, including some matter fields in the O_k implies
saying No to Leibniz's question, i.e. saying that the model produced by spatial
translation does not represent another possible world, but is faithless. For the
translation alters some points' properties that are coded by the chosen matter
fields (setting aside symmetries of these fields); but these properties are just as
essential as metrical ones.)

One choice for the O_k deserves special mention: the choice of all the
geometric objects O_i. This choice faces the problems above, if anything more
severely than other versions. For the O_i-isomorphisms are now just the
isomorphisms, and the isomorphism classes are smaller than the
O_k-isomorphism classes given by other choices of O_k. So there are many
isomorphism classes: and our theory is bound to have models not isomorphic 
with the actual world, giving Maudlin's problem; and if we respond to this 
problem by relativizing, we have many values of the relativization parameter. 

So much by way of presenting the problems facing essentialism. Those 
problems make one tempted by haecceitism (‘Los Angeles’). I take this to be the 
proposal that there can be 'bare identity' of points between possible worlds, 
irrespective of how they are painted: more precisely, there are facts of the 
matter about which points are in which physically possible worlds, although 
these facts cannot be stated in any condition involving properties of points, no 
matter how subtle or complex. (For some precision about how the conditions 
must be required to avoid the use of identity, see Adams [1979], pp. 6-9.) 
However one construes possible worlds and ‘transworld identity’, this proposal 
is tempting. For it promises to solve essentialism's problem of silence about 
which way to paint the actual points with a non-actual distribution of 
properties. In effect haecceitism makes a virtue of silence: it holds that there are
no rules—points are flexible and can be the same, painted with what properties 
you like. 

However, haecceitism surely cannot reconcile substantivalism and determi- 
nism. For substantivalism must answer No to (Same?), and haecceitism must 
surely endorse (Each)—at least for some pairs of models; and thus the basic 
idea of determinism will be ruled out for (GC)-theories. (And as a consequence,
Dm1 will be preferred to Dm2 as an explication of determinism.) 

A haecceitist may reply: ‘I agree that I have to endorse (Each). But I am
perfectly willing to rule out the basic idea of determinism, with its sensitivity to
the ‘underlying identity’ of points. For I am content to save the possibility of 
determinism, in some weaker sense that disregards the identity of points. Two 
such senses are made precise, in terms of models, by Dm1 and Dm2. Since the 
latter can hold for (GC)-theories, I prefer it to the former. Thus I am content to
save the possibility of Dm2 for such theories. And note that my account 
meshes quite well with the practice of the physics textbooks. Admittedly, I 
answer No to (Same?), while they answer Yes (p. 13); but I quantify over 
points, am a substantivalist, and save a textbook definition of determinism. 
What more can you want of a No answer to (Same?)?' 

I say: 'I want to save determinism as a matter of worlds, not mere models. 
And denying transworld identity shows how to get what I want.’ 

(Even if one is willing to rule out determinism, there are other problems with 
haecceitism. For recent general argument against 'transworld identity', 
especially ‘bare identity’, cf. Lewis ([1986], Chapter 4, especially Section 4). I 
shall only mention a problem specific to points. Haecceitism is apparently
committed to two possible worlds differing only in the transposition of two
points. More precisely, two models <M, O_i> and <M, O_i′> represent different
possible worlds even if there are points p, q ∈ M, such that the map that is
identity on M − {p, q} and transposes p and q is an isomorphism of the models.
That seems to be a distinction without a difference.) 


6 DENYING TRANSWORLD IDENTITY 


What we seek is a non-essentialist justification for answering No to (Same?) 
and endorsing (One). And more specifically, for taking Dm2 as an explication 
of determinism as a matter of worlds—thus improving on haecceitism. Saying 
that no point is in two worlds will give us this. It clearly justifies (One): if one 
model in Figure 1 represents a world, then the other cannot since it contains 
the first model's points. And we shall see that in the definition of determinism it 
justifies taking matching (of regions in the antecedent; of whole worlds in the 
consequent) as a matter—not of the identity of spacetime points and structure, 
but—of isomorphism as expressed in terms of diffeomorphisms. 

Lewis is the great denier of transworld identity: that is, he holds that no 
object occurs in any two worlds. He of course accepts that possible worlds 
provide the truth-conditions of modal discourse. So he offers counterpart 
theory for treating de re modal sentences, e.g. 'Hubert Humphrey might have 
won the election': in some world there is a counterpart of Hubert Humphrey 
who wins. That is, it is this counterpart's winning that makes true the 
sentence, i.e. constitutes the actual Humphrey's modal property. Counterparts
are picked out by similarity. What properties make for similarity varies from 
case to case, depending on the meaning of the sentence, so that there are many 
counterpart relations. In general, counterparts need not be exactly similar in
any respect, and often the relevant respects are mostly extrinsic to the objects. 
In particular, matching origins is often relevant: ‘Had my early years gone 
differently, I might have been a carpenter’ is made true by an other-worldly 
person very similar to me in origin and suitably different in later life. Also, what 
counts as a relevant property is often vague, since the meaning of de re modal 
sentences is often vague. Note also that a world can contain more than one 
counterpart of a given object; this makes it easy to make sense of ‘I might have 
been twins’. 

This is not the place for an account or defence of counterpart theory—so 
ably given by its inventor (Lewis [1968]; [1973], pp. 39-43; [1986], Chapter 
4). Let me only join Lewis ([1986], pp. 194-97) in two points. First, in 
rebutting the idea that counterpart theory cannot accommodate the intuition
that it is Humphrey himself, our Humphrey, who might have won. Counter- 
part theory says: indeed, it is. The question is how another world can represent 
de re, concerning our Humphrey, that he wins. Secondly, Lewis goes on to say 
that the intuition that our Humphrey must be part of the other world sits ill
with taking possible worlds as ersatz objects; for example, if a possible world is 
some kind of maximal consistent set of sentences, then Humphrey is no part of 
a possible world. I agree and conclude that ersatzers had better endorse 
counterpart theory. Lewis goes on to argue that modal realists should also do 
so: ibid., p. 168 f. It follows that whatever view you take on the debate about
Lewis’ realism, counterpart theory is plausible if not compelling. 

I propose that we deny transworld identity to points: any point is a part of 
just one possible world. (It is of course a set-theoretic constituent (member, or 
member of a member, or. . .) of many base-sets, and so many manifolds, and so 
many models.) Similarly for mereological fusions of points, i.e. spacetime 
regions. This will clearly secure (One). It remains to apply counterpart theory 
to points and regions. This will involve a discussion of determinism, and lead 
finally to an advantage of counterparts over essentialism, independent of 
determinism. 

The application of counterpart theory is straightforward. The main 
difference from the general situation will be that similarity being vague and
often extrinsic does not arise for us. A precise notion of similarity (intrinsic for 
regions, largely extrinsic for points) is captured by the idea of isomorphism for 
spacetime regions. Indeed, isomorphism captures, for spacetime theories, 
Lewis’ recent idea of duplication. Lewis advocates this idea as one application 
(among many) of a distinction among properties. Lewis construes properties in 
terms of his ontology of possible worlds and possible objects, each confined to 
its world: any class of possible objects is a property. Intuitively, the class is the 
property’s extension across all the worlds. He distinguishes an élite minority of 
natural properties, whose sharing makes for resemblance, which are relevant 
to the causal powers of objects, and which it is the business of science to 
discover; the countless throng of other properties are unnatural. (He similarly
distinguishes natural and unnatural relations.) This distinction is contentious; 
in particular, Lewis holds that a property is natural or not, once and for all— 
not relative to a world or a theory. He urges the distinction by appeal to its 
beneficial consequences: a single distinction turns out to have many useful
applications ([1983]).

I do not need Lewis' distinction. But it is worth describing how he uses it to
analyse the idea of duplication and thereby to offer a general definition of
determinism in terms of worlds. For although his definition is not intended for
spacetime theories, it is remarkably like Dm2 which I extracted ([1987], pp.
17-29, 26-9) from Hawking and Ellis: a happy agreement. And his definition 
will bear upon counterpart theory for points. 

We are all familiar with approximate duplication, e.g. when we make a 
xerox. And we have the idea of a more perfect duplication, e.g. in temperature 
or chemical composition. Indeed, physics teaches us that individual atoms of 
the same isotope or elementary particles of the same kind (in the same 
quantum state) are perfect duplicates. But countless properties would not be 
shared by perfect duplicates: we have words for some of them—perhaps our 
two atoms have different owners. Thus it seems that duplication is a matter of 
shared intrinsic properties. But how to analyse 'intrinsic'? Lewis urges that 
although some intrinsic properties are not natural (e.g. ‘cubical or charged or 
liquid’—intrinsic since its disjuncts are), all natural properties are intrinsic. So 
duplication is analysed: any two objects are duplicates iff (1) they have the 
same natural properties; and (2) their parts can be put into correspondence in 
such a way that corresponding parts have the same natural properties and 
stand in the same natural relations ([1983], pp. 355-8; [1986], pp. 61-3).
(Maybe (2) is redundant; it depends on whether some properties concerning 
how a thing is composed are natural.) 

Lewis then defines determinism (of the future by the past) as follows. He says
that two worlds diverge iff some initial segment of one is a duplicate of some
initial segment of the other, but the whole of the one is not a duplicate of the
whole of the other. And a theory is deterministic iff no two divergent worlds 
both make it true (conform perfectly to it) ([1983], pp. 359-60). 

Lewis' definition of determinism is remarkably like Dm2. Indeed, in 
accordance with substantivalism and our physicalist limitation to the 
properties and relations coded by geometric objects, let us assume (i) worlds
have manifolds M, M′, . . . , equipped with geometric objects O_i, O_i′, . . . ; and (ii)
the properties and relations coded by these objects are the natural ones. Then
Lewis’ definition of duplication will mean that spacetime regions are duplicates
iff they are isomorphic. (Lewis’ existential quantification over correspondences
of the parts becomes quantification over diffeomorphisms.) We can now make
Lewis’ definition equivalent to Dm2, by generalizing from determination by 
the past to determination by a region S. Thus let us say that two worlds diverge
off S iff: (1) they both contain regions S, S′ of kind S; and (2) there is a
diffeomorphism α: S → S′ with α*(O_i) = O_i′ on α(S) = S′; and (3) there is no global
isomorphism β: M → M′ with β*(O_i) = O_i′ and β(S) = S′. (Notice that in (3), β is
not required to extend α; so (3) is a strong denial, and divergence off S is a
strong notion.) And let us say that a theory is S-deterministic if no two worlds,
diverging off S, both make it true. This is plainly equivalent to Dm2.
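
The definition just given can be set out schematically. What follows is only a sketch of its clauses in symbols: the labels Div_S and Det_S, and the Greek letters used for the local and the global map, are notational assumptions introduced here for illustration and are not the paper's official notation.

```latex
% Worlds W = <M, O_i> and W' = <M', O_i'>;
% S \subseteq M and S' \subseteq M' are regions of kind S.
\begin{align*}
\mathrm{Div}_S(W, W') \iff\;
  & \exists\, \alpha : S \to S' \text{ a diffeomorphism with }
    \alpha^{*}(O_i) = O_i' \text{ on } \alpha(S) = S', \\
  & \text{and } \neg\, \exists\, \beta : M \to M' \text{ a diffeomorphism with }
    \beta^{*}(O_i) = O_i' \text{ and } \beta(S) = S'. \\[4pt]
\mathrm{Det}_S(T) \iff\;
  & \neg\, \exists\, W, W' \text{ making } T \text{ true such that } \mathrm{Div}_S(W, W').
\end{align*}
```

On this rendering the second conjunct of Div_S denies the existence of any global isomorphism carrying S to S′, not merely of one that restricts to the local map on S; that is the sense in which divergence off S is a strong notion.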

Lewis points out that his definition of determinism does not require the 
denial of transworld identity. Indeed not: it, like Dm2, is neutral on transworld 
identity. But to secure (One), I want to deny such identity. So I have a choice of 
terminology. I can use 'counterpart'; and so connote no transworld identity 
(correct), and similarity being vague (incorrect) and extrinsic (incorrect for
regions). Or I can use ‘duplicate’; and so connote commitment to Lewis’ 
distinction (incorrect), neutrality on transworld identity (incorrect), and 
similarity being precise and intrinsic (correct). I prefer the first option. 

So I want to define counterparthood for points and regions in terms of 
isomorphism of regions. Recall first that although we can directly compare the 
values of scalar fields at two points in two manifolds, we cannot do so for 
vectors and tensors. Such a comparison has to be made relative to a 
diffeomorphism of the manifolds, or a class of them: each diffeomorphism d 
drags geometric objects at p to geometric objects at d(p). If the dragged objects 
at d(p) coincide with the originals, we can say that p and d(p) are counterparts 
relative to the diffeomorphism d. If we have coincidence of this kind on a class 
of diffeomorphisms that agree in their value for p, we can say that p and this 
value are counterparts relative to the class. The counterpart relation will be 
more stringent the wider the class of diffeomorphisms. Similarly, we can say 
that two regions are counterparts relative to a diffeomorphism (or a class of 
them) under which the regions are isomorphic. 
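
In symbols, the relativized counterpart relation for points might be sketched as follows. The relation symbols C_d and C_D are labels assumed here for illustration, and d_* simply stands for the dragging of geometric objects along d, under whatever push-forward or pull-back convention one adopts.

```latex
% d : M \to M' a diffeomorphism; O_i on M, O_i' on M'.
\begin{align*}
p \; C_d \; q \;&\iff\; q = d(p) \ \text{and}\
   (d_{*} O_i)\big(d(p)\big) = O_i'\big(d(p)\big) \ \text{for every } i, \\
p \; C_D \; q \;&\iff\; p \; C_d \; q \ \text{for every } d \in D
   \qquad (D \text{ a class of diffeomorphisms agreeing in their value at } p), \\
D \subseteq D' \;&\implies\; \big( p \; C_{D'} \; q \;\Rightarrow\; p \; C_D \; q \big).
\end{align*}
```

The last line records the remark that the relation becomes more stringent as the class widens: whatever is a counterpart relative to the wider class is a counterpart relative to any class contained in it.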

Thus for worlds to be isomorphic under d, is for each point p in the first world 
and its image d(p) to be counterparts relative to d. And for worlds to match on a 
region S (S a region in one world) by the diffeomorphism d, in the sense of the 
definitions Dm1 and Dm2, is for S and d(S) to be counterparts relative to d; 
equivalently, for the points in S to be counterparts under d of their images d(p). 
So we can now spell out Dm1 and Dm2 in terms of counterparts; I shall speak 
of points’ counterparts—the gloss in terms of regions is similar. Dm1 says that 
any global mode of comparison relative to which the points in a region of kind 
S are counterparts of their images, is one relative to which any point is a 
counterpart of its image. Similarly, Dm2 says that if a local mode of 
comparison is one relative to which the points in a region of kind S are
counterparts of their images, then there is a global mode of comparison relative 
to which any point is a counterpart of its image. 
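
Read this way, the two definitions differ in where the quantifier over modes of comparison sits. The following schematic rendering transcribes only the gloss just given, not the official statements of Dm1 and Dm2 earlier in the paper; C_d abbreviates 'is a counterpart, relative to d, of' and is a label assumed here for illustration.

```latex
% W = <M, O_i>, W' = <M', O_i'>; S \subseteq M a region of kind S.
\begin{align*}
\text{Dm1:}\quad
  & \forall\, d : M \to M' \ \text{(a global diffeomorphism)}:\;
    \big( \forall p \in S:\ p \; C_d \; d(p) \big)
    \;\Rightarrow\;
    \big( \forall p \in M:\ p \; C_d \; d(p) \big), \\[4pt]
\text{Dm2:}\quad
  & \big( \exists\, \alpha \ \text{(a diffeomorphism defined on } S \text{ only)}:\
    \forall p \in S:\ p \; C_{\alpha} \; \alpha(p) \big) \\
  & \;\Rightarrow\;
    \big( \exists\, \beta : M \to M' \ \text{(a global diffeomorphism)}:\
    \forall p \in M:\ p \; C_{\beta} \; \beta(p) \big).
\end{align*}
```

In the first schema the same map figures in antecedent and consequent; in the second the global map is not required to extend the local one.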

It is now clear how counterpart theory justifies taking the matching of 
worlds as a matter of isomorphism, without regard to the underlying identity 
of points. No point occurs in two worlds, so isomorphism is all that matching 
can mean. 

We can also see why counterpart theory can accommodate Dm2 more 
easily than essentialism (or any doctrine of transworld identity) can. If we
believe that a point can occur in two worlds, we are bound to think that 
matching of worlds on a region can be given a strong sense: the very same 
points with the very same properties and relations to each other. And if we are 
given a pair of worlds that match on a region in this strong sense, then we are 
bound to think that the only sense of global matching that can be relevant to 
judging whether determinism is upheld or violated by this pair is a sense that 
extends the given match—and thus respects the identity of the points of the 
region. That is, the global diffeomorphism between the worlds must extend the 
identity map on the region. Dm1 incorporates this requirement; Dm2 does 
not—and we saw (p. 9) that it cannot do so, on pain of being violated by hole 
diffeomorphs, and so ruling all (GC)-theories indeterministic. Thus essentialism
has a problem accommodating Dm2. On the other hand, counterpart
theory has no such problem: if a point cannot occur in another world, but can
only have counterparts there, the counterpart provided by the given match on 
a region has no special privileges over other counterparts. 

To sum up, I hold that counterpart theory provides the best justification for
answering No to (Same?) and endorsing (One), thus reconciling substantivalism
and the possibility of determinism for (GC)-theories. It saves determinism
as a matter of worlds. And it readily accommodates the definition Dm2, which
is needed for (GC)-theories to be deterministic.

An objection: You said that the basic idea of determinism is that a single 
physically possible world is specified by the facts on a certain region of 
spacetime. But your counterpart theory makes determinism something else: a 
matter of global similarity of worlds, under a certain mode of comparison, 
being induced by similarity of regions. Moreover, these modes of comparison 
can be chosen very freely: any diffeomorphism will provide one. 

I reply: I agree that this is a disadvantage of my proposal. But I think it is a 
small one. Because as mentioned above, counterpart theory for all kinds of 
objects is plausible. And counterpart theory together with substantivalism
imply that the basic idea above is automatically true, in a trivial way. For if no
object inhabits two worlds, then picking out just one object suffices to specify a 
possible world. And for a substantivalist, the spacetime points are among the 
objects, and any physical fact on a region of spacetime involves picking out a 
spacetime point. Thus the basic idea of determinism is trivially true, whatever
the details of one’s spacetime theory. Thus if determinism is to be non-trivial, it 
should be formulated in other terms: comparison and matching are the 
obvious terms to use. 

You might object: ‘But you have been so fearful of automatically ruling out 
determinism at a stroke; so why are you not equally fearful of ruling it in at a 
stroke?’ Again, I agree that there is a disadvantage. But recall the word of 
warning at the start of Section 4 (p. 11)—that in assessing definitions of 
determinism, their explicating the basic idea is not the only merit to be 
considered. To be sure, it is one merit. That is why Section 4 argued that 
although Dm2 is not violated by hole diffeomorphs, we should try to justify
(One). We have seen that there are two possible justifications: essentialism, 
which is implausible, and denying transworld identity. Thus the search to
justify (One) ends by ruling in the basic idea. Agreed, that is a disadvantage.
But the main point is that although the basic idea is ruled in, there is a precise
definition which is not ruled in (nor ruled out), and which—as we have seen— 
meshes well with the denial of transworld identity. In short, I am content to 
rule the basic idea of determinism in or out, if this seems best overall and 
provided that a corresponding precise definition used by the textbooks is not 
also ruled in or out. And Dm2 satisfies this. 

Can we define a counterpart relation, that is not relative to a diffeomorphism 
or a class of them? I think not—there must be a mode of comparison, and that 
requires a diffeomorphism. But sometimes there is a natural class, so that one 
intuitively suppresses it. The obvious case is the class of all isomorphisms 
between two worlds: it is natural to say that a point is the counterpart of all its 
images under the various isomorphisms; and again, a similar remark applies to 
regions. 

This brings out a final point: an advantage of counterpart theory 
independent of the issue of determinism. It relates to the remark that a world
can contain more than one counterpart of a given object ('I might have been 
twins', p. 22). Thus using the class of all isomorphisms, a suitably homo- 
geneous world will in general contain many points that are counterparts of a 
given point in another world; and similarly for regions. In extreme cases, there 
are pairs of worlds such that every point in one world is a counterpart of every 
point in the other: e.g. a pair of empty constant-curvature worlds with 
diffeomorphic global topology—such as empty flat classical spacetime; again, 
similarly for regions. (Furthermore, symmetries of a world can be described in 
terms of Lewis’ recent idea that an object can be the counterpart of another 
object in the same world; see Lewis [1983a].) This many-many counterpart
relation seems to me far preferable to essentialism. For it accommodates the 
intuition that in Leibniz's thought-experiment about translating the material
contents of the universe three feet East, we must also identify the points 
according to the matter that inhabits them, so that the translation does not 
produce another world. It accommodates this intuition by the fact that the 
counterpart relation ‘tracks’ the fields. On the other hand, it does not face the 
three problems of essentialism described in Section 5, one of which prompts
counterparts anyway.¹

University of Cambridge


1 This paper owes a great deal to the writings of John Earman. For comments on previous 
versions, I am very grateful to audiences at Cambridge, Dubrovnik, Toronto, and Western 
Ontario; and especially to Tian Yu Cao, John Earman, David Lewis, John Norton, Michael 
Redhead, Paul Teller and Roberto Torretti. My discussion of essentialism has profited greatly
from Maudlin’s [1988]. 


REFERENCES 
ADAMS, R. [1979]: ‘Primitive Thisness and Primitive Identity’, Journal of Philosophy, 76, 
pp. 5-26.

BUTTERFIELD, J. [1984]: ‘Relationism and Possible Worlds’, British Journal for the 
Philosophy of Science, 35, pp. 101-13. 

BUTTERFIELD, J. [1987]: ‘Substantivalism and Determinism’, International Studies in the
Philosophy of Science, 2, pp. 10-32.

EARMAN, J. [1977]: ‘Leibnizean Spacetimes and Leibnizean Algebras’, in Butts and
Hintikka (eds.), Historical and Philosophical Dimensions of Logic, Methodology and
Philosophy of Science. Dordrecht: Reidel.

EARMAN, J. [1986]: A Primer on Determinism. Dordrecht: Reidel.

EARMAN, J. [1987]: ‘Why spacetime is not a substance’, preprint. 

EARMAN, J. and GLYMOUR C. [1978]: ‘Lost in the Tensors: Einstein's struggles with 
covariance principles 1912-1916’, Studies in the History and Philosophy of Science, 
9, pp. 251-78. 

EARMAN, J. and NORTON, J. [1987]: ‘What Price Substantivalism? The Hole Story’, British
Journal for the Philosophy of Science, 38, pp. 515-25.

GEROCH, R. [1968]: ‘Spinor Structure of Spacetimes’, Journal of Mathematical Physics, 9,
pp. 1739-44. 

GEROCH, R. [1972]: ‘Einstein Algebras’, Communications in Mathematical Physics, 26, pp.
271-5. 

HAWKING, S. and ELLIS, G. [1973]: The Large Scale Structure of Spacetime. Cambridge:
Cambridge University Press. 

LEWIS, D. [1968]: ‘Counterpart Theory & Quantified Modal Logic’, Journal of Philosophy,
65, pp. 113-26.

LEWIS, D. [1973]: Counterfactuals. Oxford: Blackwell.

LEWIS, D. [1983]: ‘New Work for a Theory of Universals’, Australasian Journal of
Philosophy, 61, pp. 343-77.

LEWIS, D. [1983a]: ‘Individuation by Acquaintance & by Stipulation’, Philosophical
Review, 92, pp. 3-32.

LEWIS, D. [1986]: On the Plurality of Worlds. Oxford: Blackwell.

MANDERS, K. [1982]: ‘On the Space-Time Ontology of Physical Theories’, Philosophy of 
Science, 49, pp. 575-90. 

MAUDLIN, T. [1988]: Substances and Spacetime, forthcoming in Synthese. 

MONTAGUE, R. [1974]: ‘Deterministic Theories’, in his Formal Philosophy. New Haven:
Yale University Press. 

MUNDY, B. [1983]: ‘Relational Theories of Euclidean Space and Minkowski Spacetime’,
Philosophy of Science, 50, pp. 205-26.

NORTON, J. [1984]: ‘How Einstein found his field equations’, Historical Studies in the
Physical Sciences, 14, pp. 253-316.

NORTON, J. [1987]: ‘Einstein, the Hole Argument and the Reality of Space’, in J. Forge 
(ed.), Measurement, Realism and Objectivity. Dordrecht: Reidel. 

SACHS, R. and WU, H. [1977]: General Relativity for Mathematicians. London: Springer.

STACHEL, J. [1980]: Einstein's Search for General Covariance, paper read at 9th 
International GRG Conference, Jena. 

STACHEL, J. [1985]: ‘What a Physicist can Learn from the Discovery of General 
Relativity’, Proceedings, 4th Marcel Grossmann Meeting on General Relativity. Rome. 

STEIN, H. [1977]: ‘Some Philosophical Pre-History of General Relativity’, in J. Earman et
al. (eds.), Foundations of Spacetime Theories. Minnesota: University Press.

TORRETTI, R. [1983]: Relativity & Geometry. London: Pergamon.

WALD, R. [1984]: General Relativity. Chicago: University Press. 


Brit. J. Phil. Sci. 40 (1989), 29-37 Printed in Great Britain 


Primary Qualities are Secondary 
Qualities Too 


GRAHAM PRIEST 


1 Introduction: Realism and Quantum Mechanics 
2 The Collapse of the Wave Packet 

3 Primary and Secondary Qualities 

4 What's the Matter Now? 

5 Realism and Non-locality 

6 An Objection 

7 The Hierarchy of Matter 

8 Conclusion 


1 INTRODUCTION: REALISM AND QUANTUM MECHANICS


Quantum theory appears to be a gift from the gods for philosophers of an anti- 
realist persuasion. It is a well established and very successful theory which 
appears to defy a realistic interpretation. I shall argue in this paper that a 
realistic interpretation of quantum mechanics is possible, though not without 
some change in our conception of the nature of the real. I take it to be very 
important that this is possible; for, ultimately, realism is the only satisfactory 
view. I shall not argue this here. I say it merely by way of putting my cards on 
the table. 


2 THE COLLAPSE OF THE WAVE PACKET 


Let us start with the problems of giving quantum mechanics a realistic 
interpretation. These are all aspects of the one fundamental problem of the 
collapse of the wave packet; so let me describe this. 

Let us suppose that we have some system, S, which can be in a range of
macro-states, X. We might think of S as an electron and a member of X as the 
state of occupying a certain point in Euclidean 3-space. (More picturesquely, 
one might think of S as Schroedinger's cat, and X as the pair « dead, alive >.) It 
will be useful to have a name for the sorts ofthings that can bein X. For reasons 
that will become clearer later, I will call them Newtonian properties. The 
situation S is in at any time is given by a state function, y. Mathematically 


30 Graham Priest 


speaking, can be conceived of in several ways; but a standard way of 
thinking ofit is as a vector in a certain vector space over the complex numbers. 
The state function, although in some sense specifying what is there, does not 
tell us what an observer will ‘see’ if they look at the system, that is, which 
Newtonian state S will appear to be in if this ts experimentally determined. For 
example, it does not determine where the electron will appear if someone 
measures Its location. At best, yy determines, via certain vector operations, the 
probability of S being observed to be in a particular state in X.1 Yet when S is 
observed, it is always found to be in exactly one state in X. 
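
The rule relating ψ to observed frequencies (stated in the footnote to this paragraph) can be made concrete with a small numerical sketch. The Python fragment below is an illustration only and not part of the original argument: the function name `born_probabilities` and the two coefficients chosen for ψ are assumptions introduced here. It computes q*·q for each coefficient q in the expansion of ψ, normalising first so that the probabilities sum to one.

```python
import math


def born_probabilities(coefficients):
    """Return q* . q for each coefficient q of the state, after normalising.

    `coefficients` are the (complex) coefficients of the state function in
    the eigenvector basis belonging to the observation in question.
    """
    norm_sq = sum((q.conjugate() * q).real for q in coefficients)
    if norm_sq == 0:
        raise ValueError("the zero vector does not describe a state")
    return [(q.conjugate() * q).real / norm_sq for q in coefficients]


# Hypothetical two-state system (e.g. the pair 'dead', 'alive');
# the coefficients are illustrative assumptions, not taken from the text.
psi = [complex(1, 0) / math.sqrt(2), complex(0, 1) / math.sqrt(2)]
probs = born_probabilities(psi)
```

Note that the sketch yields only probabilities. Nothing in it fixes which single state in X will actually be observed, which is precisely the point at issue.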

One should not, incidentally, read too much into the word ‘observer’. 
Although it is sometimes suggested that ‘the observer’ in quantum mechanics 
must be conscious, this is, at least as far as I can see, quite unnecessary; the 
observer might be a person, a camera, a screen, or any other interacting 
system. Now, to return to the main point, according to the orthodox,
Copenhagen, interpretation of quantum mechanics, S is in no determinate
Newtonian state before an observation, but is precipitated into one by it. This is
the so-called collapse of the wave packet. But to a realist it seems absurd to
suppose that the mere observation of the system somehow forces it into the
state it is observed to be in. The supposition that the system has the property of
being in that state only by being observed is, indeed, a form of idealism. To be
sure, the observation, by causally interacting with S, will force it into a new
state, ψ′, subsequently; but if S is observed to be in a certain state at a certain
time then, presumably, the state is determinate prior to observation. 

To bring this home is the point of the famous Einstein/Podolsky/Rosen 
thought experiment. One way of looking at this is as follows: A pair of particles, 
a and b, is produced in a singlet state. This means that they have correlated 
spins (or some other property). When the particles are a long way apart we 
determine the spin of a, and hence, the spin of b. Now at what point did b 
obtain this spin? Not when it was observed. (We can even ensure that no 
observation is made to determine b’s spin by measuring a complementary 
property.) Nor can it be when a’s spin was measured—because a and b are a 
long way apart when this happens. It must, therefore, have had it all along. 

Thus, to a realist, it will appear that the state description, ψ, is a seriously
incomplete description. We have therefore seen the development of ‘hidden
variable’ theories in quantum mechanics, which try to eliminate the
incompleteness by the addition of another parameter. These have not,
perhaps, been notably successful; but at least until relatively recently it was
possible to hope that something like this could be made to work. This is now a
very dubious hope. By one of the major ironies of the subject, the Einstein/
Podolsky/Rosen thought experiment was transformed into an actually
performable experiment by the ingenious Dr. Bell; and the experimental
verdict has gone against realism. In Bell's experiment we take a large number of
pairs of, e.g., electrons in a singlet state, and measure complementary
properties of each member of the pair. When we try to explain the observed
results on the realist assumption that the measured Newtonian property
(spin) is intrinsic, and assume locality (that there is no communication
between the particles after they separate), we get the wrong result. Rather, the
result is as quantum mechanics predicts.²

¹ Specifically, the probability of S being observed to be in state x is q*·q, where q is the coefficient of
the eigenvector corresponding to x in the expansion of ψ along the basis provided by the class of
all such eigenvectors. 

Hence, it would seem that the system, S, is not determinately in any 
Newtonian state before it is observed to be so (unless locality is violated—a 
straw at which even a drowning man would think twice before clutching). 
‘Confusion to my enemies’ says the idealist. 


3 PRIMARY AND SECONDARY QUALITIES 


It is clear that the 20th century has seen profound changes in physics. Part of 
the problem in getting to grips with their import is in obtaining a suitable sense 
of perspective for something that close. I will therefore leave the subject of 
quantum mechanics for a while and discuss briefly the only period in physics 
comparable in the profundity of its changes—the scientific revolution of the 
17th century. It seems to me that there are some important lessons to be learnt 
from this. In particular, I want to focus on the change in the conception of 
matter that occurred at this time. As a result of the work of Galileo, Descartes 
and others, the mechanistic conception of matter was formed. Matter is 
characterized primarily by its extension and its locatability in space and time. 
These (and a few other) Newtonian properties are its primary properties. It 
would have them even if there were no perceivers of matter. By contrast, a 
number of the properties of matter, that had been thought of as on a par with, 
or even as more important than, these before, became thought of as secondary. 
Principally, the colour, smell etc. of matter were not intrinsic to it, but were in 
some way relational, relating the matter observed and the observer. If there 
were no observers, then matter would not be coloured in the same way that it 
is extended. 

The distinction between primary and secondary qualities was laid out 
explicitly by Boyle and Locke. I shall not discuss it and its rationale at any 
length here. I note only that, first, the secondary properties of an object arise 
because of the interaction between the object and an observer—indeed, it is 
just their observer-relativity which marks them out; and secondly, this 
distinction, however one fills in the exact details, is now well established: the
colour something appears is dependent on the state of the sense organs of the
perceiver, the context of perception etc. (Though, one should note, this does
not make secondary qualities subjective, at least in one sense: similar observers
in similar situations still have the same perceptions.)

² For an excellent description of this, together with a case for idealism on the basis of it, see
Mermin [1981]. 

Of course, matter can have an intrinsic dispositional property of producing 
that kind of perception in that kind of observer; and this is sometimes called a 
secondary property too. However, I shall use the term ‘secondary property’ 
solely as referring to the appearance. This disposition is therefore not a 
secondary property, but its (partial) cause. How one is to understand these 
dispositional properties is another question. Historically, it was answered only
with the appearance of the 19th century atomic theory of matter and wave
theory of light. With the help of these, it could be shown that the dispositions
were really aggregate primary properties of the micro-structure of matter. It
then became clear that to say, e.g., that fundamental matter, an atom, is 
coloured, is not just false, but is a category mistake. Thus did these theories 
bring to fruition the mechanistic conception of matter, a conception that 
changed profoundly our understanding of matter, its properties, and their 
relationship to the observer. 


4 WHAT'S THE MATTER NOW? 


With this historical situation fresh in our thoughts, let us now return to that in 
quantum mechanics; for there is a strong analogy between the two 
situations.³ 

The scientific revolution produced a novel conception of matter according to 
which matter was radically different from the way in which it had been 
conceived of previously. Much of how it had been conceived of before was 
consigned to the realm of appearances. It is perhaps easy to overlook the 
strangeness of the idea of matter that is essentially colourless, textureless, etc., 
to 17th century sensibilities. The fact that many of the old paradigm properties 
of matter became perceiver-dependent made it tempting to think that there is 
nothing to matter itself over and above what is perceived. Thus, the scientific 
revolution could seem to entail idealism. Indeed, Berkeley and Hume were 
tempted to the point of succumbing. However, once one becomes clear that it is 
a new notion of matter which is at issue, with different essential properties, it 
becomes clear that there is nothing in the new science that requires idealism; 
quite the contrary: matter is as real as it ever was. 

In a similar way, it seems to me, the situation in quantum mechanics should 
be seen as occasioning a revision of our conception of matter (that is, of 
physical reality). The fact that the conception of matter produced by the 17th 
century scientific revolution took several hundred years to come to fruition, 
should remind us that we need to exercise extreme caution in saying what
form the new conception of matter will finally take; but the outlines, at least,
seem fairly clear. Reality, whatever that is, is described completely, and
without residue, by state functions such as ψ. Thus, these functions are to be
interpreted realistically, not instrumentally. Reality, or matter, then, is the
physical realization of a certain kind of vector. (Just as a force is the physical
realization of another kind of vector.) Let us call these realizations ψ states. The
‘shape’ of a state (in particular, the coefficients of its eigenstate expansions) is
an intrinsic property of that state. Such properties are observer-independent,
and are analogous to the primary properties of the mechanistic conception.

³ This analogy was first noted, as far as I am aware, by Sava Petrov. See, e.g., his [1985]. The
conclusions he draws from this analogy are, however, somewhat different from those I shall
draw. I am grateful to him for his comments on an earlier draft of this paper. 

A Newtonian property such as having a certain spin (or, perhaps better 
now, appearing to have a certain spin) is not an intrinsic property according to 
this conception, but is observer-dependent. Such properties result from an 
observation acting on the ψ state, just as the mathematics, where the operator
corresponding to the observation acts on the ψ function, has it. Thus, 
Newtonian properties are analogous to the secondary properties of the 
mechanistic conception. 

The change in conception is a radical one; but that is, after all, what 
scientific revolutions are all about. However, it should be clear that the new 
scientific situation no more requires the rejection of realism than did the 17th 
century scientific revolution. ψ states exist with a determinate nature quite 
independently of any cognition. One can draw the idealist conclusion only if 
one clings to the old conception of matter, taking the paradigm properties of 
the old conception to be the essential properties of matter itself. Nothing in the 
science requires us to do this. 


5 REALISM AND NON-LOCALITY 


Let me spell out some of the consequences of this view, particularly those 
concerning non-locality. Bell's experiment leaves a traditional realist little 
option but to accept some kind of non-locality. However, this should not be 
particularly surprising. After all, essentially the same conclusion can be drawn 
from a much more traditional experiment in quantum mechanics: the
two-slit experiment. I will first discuss this; then return to Bell's experiment. 

In the two-slit experiment a light is shone through parallel slits in a mask,
and the resulting light falls on a screen. If one tries to understand the result in
particle terms, it would appear that the light pattern on the screen ought to
be the sum of the two patterns obtained from each slit independently; but it is
not. The situation becomes extreme when the intensity of the light is reduced
until only a single photon goes through the mask at any time. The result then
ought to be either the result of the particle going through one slit, or the result
of it going through the other; but in fact, it is neither. The particle ‘knows’ that
one slit is open, even though it goes through the other. For the idealist of the
Copenhagen variety, this poses no problems: since the particle is not observed
to go through either slit, it in fact goes through neither. So the question of how
it ‘knows’ that one slit is open when it goes through the other does not arise. To
the traditional realist this is absurd; yet the only alternative seems to be that
the open slit which the particle does not go through exerts some influence on it.
Hence they are forced into accepting non-locality, or, to give it its traditional
name, action-at-a-distance. 

Realists of the kind I have been describing are not forced into this situation, 
however. For they can agree with the idealist that the ‘particle’ has no position 
independently of being observed at the screen. Indeed, there is no particle in 
reality, just a ψ state determined by the light-source and mask. To suppose that
one of the slits has an effect on a particle located elsewhere, is precisely to
suppose that there is a particle in reality, and that its position is an intrinsic
(non-observer-dependent) property. To invoke action-at-a-distance is, therefore,
to completely misunderstand the situation. The projection of these
‘secondary’ qualities onto reality is just as mistaken as Protagoras' projection
of contradictory properties onto reality merely because things are perceived in
contradictory ways by different observers.⁴ As should be clear, however, the 
agreement with the idealist about the above does not entail agreement about 
idealism, any more than rejecting Protagoras' projection commits one to the 
view that all properties are secondary properties. 

The same point is to be made about the Bell experiment. To suppose that the 
observation of one particle affects the other at a distance is to presuppose the 
existence of particles with intrinsic Newtonian properties. But there are, in 
reality, no two particles with intrinsic positions. The real situation is described, 
quite literally, by a certain ψ function. There is but a single state, and spin-at- 
point-A and spin-at-point-B are two of its observer-relative (Newtonian) 
properties. Thus, the Bell experiment may force us to give up the intrinsic 
nature of Newtonian properties, but it does not force us to give up realism. 

It remains true that there is a certain holism involved in quantum 
mechanics. For the result of an observation at point A, which depends, in part, 
on the observing set up there, can certainly have implications for the result of 
another observation at point B, a long way away. But that’s just what quantum 
mechanics tells us that reality is like. It may be surprising, but it does not reflect 
ill on the realism I have been suggesting. In particular, we are not forced into 
the embarrassing position of speculating about the spurious mechanism 
involved, or, what this comes to, into non-local hidden variable theories. 


6 AN OBJECTION 


Before I conclude, let me answer an objection to this account.⁵ Take some
state, ψ, and make the same observation on it twice. Suppose, for example, we
measure spin. As long as nothing happens between observations, we will get
the same answer both times (as quantum theory predicts). Now, if the spin is
not an intrinsic property of the particle this ‘coincidence’ would not occur in
general. Hence spin must be intrinsic. The major premise of this argument
concerning spin is, however, false. Even though the spin is not an intrinsic
property of the particle, none the less, the observed spin is a product of the ψ
state and the measuring device, and if these are the same in both cases the
result is, naturally, the same. After all, one observer (or two observers identical
in the relevant respects) who makes two observations of the same (macroscopic)
object under the same conditions will see it as having the same colour
both times.

⁴ See, e.g., p. 20f of Kerferd [1949].
⁵ For which I am indebted to an anonymous referee. 

It may be replied that the state of the particle is not the same on the two 
occasions. The first observation will result in the state changing to a (different) 
eigenstate of the observation operator, ψ′. This may be so, but misses the point.
Even in the case of the colour observation, the conditions under which any two
observations are made are never identical; they need only be the same in the
relevant respects (light conditions etc.). And ψ and ψ′ are the same in the 
relevant respects. What are the relevant respects? To answer this we can only 
be guided by the theory. This tells us that to some extent it may be a matter of 
chance what we observe on the first observation; but second and subsequent 
observations, acting on an eigenstate, must produce the same result. The 
theory therefore tells us that the states are relevantly identical. 
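
The theoretical claim in this paragraph, that the first observation may be a matter of chance while later ones acting on an eigenstate must agree with it, can be sketched as a simulation. The Python below is a hedged illustration under stated assumptions: the helper `measure`, the coefficients of ψ, and the fixed random seed are all hypothetical, and the change of state on observation is modelled in the simplest textbook way, by replacing the state with the eigenvector belonging to the observed outcome.

```python
import random


def measure(state, rng):
    """Simulate one observation in the basis in which `state` is expanded.

    The outcome is drawn with probability |q|^2 for each coefficient q.
    Returns (outcome index, post-observation state), where the new state
    is the eigenvector belonging to the outcome.
    """
    weights = [abs(q) ** 2 for q in state]
    outcome = rng.choices(range(len(state)), weights=weights)[0]
    collapsed = [0j] * len(state)
    collapsed[outcome] = 1 + 0j
    return outcome, collapsed


rng = random.Random(0)            # fixed seed, purely for reproducibility
psi = [0.6 + 0j, 0.8j]            # illustrative coefficients (assumed)

# First observation: which outcome occurs is a matter of chance.
first, psi_prime = measure(psi, rng)

# Subsequent observations act on the eigenstate psi_prime.
repeats = [measure(psi_prime, rng)[0] for _ in range(100)]
```

On this sketch ψ and ψ′ differ as vectors, yet every repeated observation returns the first result, which is the sense in which the theory treats the two states as identical in the relevant respects.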

The objection is, in fact, doomed to failure for very general reasons: it is 
impossible to refute the interpretation of the formalism I have suggested on the 
grounds that it does not explain something, if that very thing is predicted by 
the formalism; for my suggestion is exactly to take the formalism seriously. It 
provides the one and only account of what is really happening. 


7 THE HIERARCHY OF MATTER 


It is tempting—a temptation to which I succumbed in giving this paper its 
title—to sum up the line of thought I have been suggesting, by saying that 
primary properties (such as spatial location) are really secondary properties; 
and there is some justice in this aphorism. However, it is also quite misleading: 
there are very real dissimilarities between secondary properties, as traditio- 
nally conceived of, and Newtonian states in quantum mechanics. For a start, 
although both kinds of property are (objectively) relative to an observer, for 
secondary properties the observer must be conscious; whereas for Newtonian 
properties the observer need not, as I have already noted. Secondly, traditional 
secondary properties are determinate functions of state-plus-observer, whilst 
Newtonian properties are non-determinate, probabilistic properties of state- 
plus-observer. 

Thirdly, and most importantly, the two kinds of property are
observer-relative at different ‘levels’ of reality. Matter may be thought of as
hierarchically organized.⁶ Its behaviour at each level may be explained in terms of the
structure of the level below. Thus, the behaviour of macroscopic bodies and
their properties is explained in terms of the (primary) properties of their
microscopic (atomic) parts. The behaviour of these and their properties is, in
turn, explained in terms of their quantum states and properties. Maybe the
hierarchy goes further, though if it does, we haven't got there yet.⁷ Now
traditional secondary properties are properties of matter at the macroscopic
level; their perception in an observer is caused (in part) by a dispositional
property to be understood in terms of structure at the microscopic level.
Newtonian properties, on the other hand, are properties at the microscopic
level (and so, derivatively, at the macroscopic level); their registering with an
observer is caused (in part) by dispositional properties (or propensities) to be
understood in terms of structure at the quantum level. This may be depicted in
the form of a table: 














Level      Characteristic property   Produced by                Dispositions are

Macro      secondary                 macroscopic disposition    aggregate properties
                                     plus observer              of primary states

Micro      primary (Newtonian)       microscopic disposition    vector properties
                                     plus observer              of quantum states

Quantum    quantum                   ?                          ?


If there is a level below the quantum level then presumably there is an entry in
the second, and perhaps the third, column of the third row also. At any rate,
the table illustrates how science reveals more and more fundamental 
structures of matter. And properties taken to be absolute at one level may be 
found to be observer-relative at the next. 


8 CONCLUSION 


It is possible to see a good deal more of 20th century physics in this light.
Quantum theory shows spatio-temporal locations to be observer-relative. But,
in a sense, the Special Theory of Relativity shows space and time themselves
(or, at least, spatial and temporal separations) to be ‘secondary properties’:
frame-relative derivatives of absolute space/time (or proper time). However,
realism was the topic of this paper; and relativity theory has never been
considered to be a challenge to this in the same way quantum mechanics is.
Thus, for the present at least, this matter need exist only in the mind of the
reader.

⁶ See, e.g., ch. 3, section 3 of Bhaskar [1975].
⁷ Note that the levels are not defined in terms of orders of physical magnitude. Though going
down a level may involve moving to smaller entities, the hierarchy is defined in terms of causal
[illegible]. Note also that the discovery of the level below may well reshape our conception of
the level above. 


Department of Philosophy 
University of Western Australia 


REFERENCES 


BHASKAR, R. [1975]: A Realist Theory of Science, Harvester.

KERFERD, G. B. [1949]: ‘Plato’s Account of the Relativism of Protagoras’, Durham
University Journal, pp. 20-6.

MERMIN, N. D. [1981]: ‘Quantum Mysteries for Everyone’, Journal of Philosophy, 78, pp.
397-408.

PETROV, S. [1985]: ‘Difficulties of Realism in Quantum Mechanics: Quite Natural’, paper
presented at the 10th International Wittgenstein Symposium. 


Brit. J. Phil. Sci. 40 (1989), 39-68 Printed in Great Britain 


Distant Action in Classical 
Electromagnetic Theory 


BRENT MUNDY 


ABSTRACT 


The standard mathematical apparatus of classical electromagnetic theory in 
Minkowski space-time allows an interpretation in terms of retarded distant action, 
as well as the standard field interpretation. This interpretation is here presented 
and defended as a scientifically significant alternative to the field theory, casting 
doubt upon the common view that classical electromagnetic theory provides 
scientific support for the physical existence of fields as fundamental entities. The 
various types of consideration normally thought to provide evidence for the 
existence of the electromagnetic field are surveyed and analyzed in retarded distant 
action terms, from both a contemporary viewpoint and with regard to the late 19th 
century context within which the field theory was first generally accepted. It is 
concluded that acceptance of the field as real is not evidentially justified in either 
context, and that the customary historical explanation of the triumph of field 
theory as due to its empirical superiority is inadequate. An alternative explanation 
is suggested but not developed, appealing to non-empirical factors associated with 
the research program based on the conservation of energy. 


1 Introduction 

2 Retarded distant action theory 

3 The scientific status of fields 

4 Medium-dependent wave propagation 

5 Retardation and space-time 

6 Energy and momentum of radiation 

7 Conservation of energy and momentum 
8 Conclusion 


1 INTRODUCTION 


The contrast between action at a distance (here abbreviated to ‘distant action’)
and field theories of electromagnetic and other interactions is familiar to
philosophers of science (Hesse [1961] remains the central reference on this
topic). It is also well known that after a period of competition between field and
distant action versions of electromagnetic theory during the middle 19th
century, the field theory of Maxwell came to be generally accepted, and in
modified form remains the standard version of classical electromagnetic
theory. (The modern theory differs from Maxwell's theory in including the
special relativistic kinematics and dynamics, and in treating matter and
electric charge as discrete or atomic rather than continuous.) The decisive
turning point in favor of the field theory is often taken to be Hertz's successful
production and detection (in 1885-8) of electromagnetic waves traveling with
the speed of light, as predicted by Maxwell's theory. 

On the basis of the above general picture, many philosophers interested in 
the ontological implications of physical theory seem to have drawn the 
conclusion that, at least at the level of classical theory, there are good scientific 
grounds for belief in the existence of electromagnetic fields: since the best 
theory is a field theory, fields probably exist. (Of course this line of argument
presupposes some form of realism with respect to theoretical entities; this will
be assumed here.) This view is expressed for example by Salmon ([1984], pp.
209-10, 241-2), in connection with his causal-mechanical account of
science. It has also recently been used as an argument against the possibility of
a relationist theory of space or space-time, on the grounds that unoccupied
space-time points or regions must exist in order that field values may be defined
on them (Field [1980], p. 35; Field [1985], pp. 40-2; Teller, to appear). Here I
will argue that this view is mistaken: at the level of classical theory there are
insufficient scientific grounds for acceptance of the existence of electromagnetic
fields.¹,² 


¹ In the present paper ‘classical’ means ‘pre-quantum’ except when used to mark the contrast
between classical and special-relativistic space-time theory. It should be obvious from context
which sense is intended.

There are several reasons for addressing this ontological question at the level of classical
theory. First, classical considerations alone do often seem to be accepted as showing the reality
of fields, in ways not invalidated by the subsequent advances of quantum theory. Second, the
formal relations between classical and quantum field theories are strong enough that a study of
the classical theory is likely to bear somehow on the quantum theory, whether or not the
conclusions carry over directly. Third and most important, there are serious problems in the
physical interpretation of all forms of quantum theory, which make it hard to determine with
any confidence just what the ontological implications or assumptions are. Since these problems
are not present in the corresponding forms of classical theory, it may be better to try to establish
the ontology using the classical theory and to address the problems of quantum theory at a
later stage.

² Field ([1980], [1985]) stresses that if space-time points or regions are accepted as real objects
then fields may naturally be analyzed as quantitative properties or relations of these objects
rather than as separate entities in their own right. He says that on such an analysis the reality
of fields is a question of ‘ideology’ (choice of predicates) rather than of ontology. I do not regard
this distinction as relevant here. It is easy to reformulate physical theories so that the type-
theoretic levels (and in particular the roles of object and property) of what are intuitively the
same entities are permuted; we may regard objects as properties of locations instead of locations
as properties of objects, and so forth. This device appears to have first been used extensively by
Whitehead ([1906] and later). In my view the question whether a particular component of a
physical theory does or does not correspond to something physically real is of equal importance
(and ontological significance) whatever may be the type-theoretic level to which it is assigned
in a particular formalization of the theory. 




Section 2 outlines a retarded distant action formulation of modern classical 
electromagnetic theory. The central point is that the mathematical apparatus 
developed within the field theory itself allows for a translation to a distant [...]

The remainder of the paper compares the retarded distant action theory and 
the standard field theory, and considers the consequences for the existence of 
fields. Section 3 discusses in a general way the inference from theory to 
ontology, and distinguishes between evidential considerations which support 
the truth of a theory and various non-evidential factors which may also favor 
one theory over another without supporting its truth, such as the relation of a 
theory to a research program. The remainder of the paper is devoted to a 
detailed discussion of the main types of evidence which have been thought to 
support field over distant action electromagnetic theory. These are: the 
dependence of electromagnetic effects upon properties of an intervening 
material medium (Section 4), the finite speed of propagation of such effects 
(Section 5), and the association of energy and momentum with electromag- 
netic processes (Sections 6 and 7). 

The discussion serves two distinct purposes. From a contemporary perspec- 
tive, it is useful to supplement the abstract assertion of empirical equivalence 
with a sketch of a modern distant action account of the main types of process 
which have been thought to require an analysis in terms of fields. From a 
historical perspective, it is also important to establish whether such accounts 
depend upon 20th century advances, or could reasonably have been offered in 
support of 19th century distant action alternatives to Maxwell’s field theory. I 
argue not only that the existence of fields is not evidentially justified in classical 
terms today, but also that it was not evidentially justified at the time of 
acceptance of field theory in the late 19th century. The common explanation 
of this transition as due to the empirical superiority of Maxwell’s theory is 
therefore to be rejected. An alternative explanation is here suggested but not 
developed, appealing to non-empirical factors associated with the research 
program based on the conservation of energy. 


2 RETARDED DISTANT ACTION THEORY 
The standard presentations of modern classical electromagnetic theory in the 


3 Hesse ([1955], p. 351) says, "The point at issue between the Continental school and that of 
Maxwell is partly a question of mathematical convenience, since formulation in terms of either 
action at a distance or fleld theory can be made to yield results that are confirmed by 
observations. . ..' Nagel [1961], pp. 395-6 makes a similar remark. The translation thesis is 
mentioned but rejected in Stein [1970], p. 283; Stein's arguments will be discussed below. 


42 Brent Mundy 


context of Minkowski space-time are based upon a skew-symmetric rank two 
tensor field F, whose six independent components represent the combined 
degrees of freedom of the non-relativistic electric and magnetic fields. (See 
Barut [1964], Rohrlich [1965] or Anderson [1967] for detailed develop- 
ments.) This field is supposed to be defined at every point of space-time, and to 
vary continuously across space-time in accordance with the partial differential 
equations of Maxwell (which are formally the same in special relativity as in 
classical space-time). The law of motion fixes the electromagnetic acceleration 
experienced by a test particle p at a space-time point x as a function of the mass, 
charge and vector velocity (space-time direction of motion) of p and the value 
of the field F at the point x. Electric currents occur as source terms in the 
Maxwell equations, linking the field values to matter. The modern theory 
differs from Maxwell's in postulating that the source currents ultimately 
consist of individual charged particles in motion. 

Because Maxwell's equations are linear, the total field may be analyzed as a 
sum of contributions $F_p$ due to individual source particles p. The contribution of 
p to the value F(x) of the field at point x is found to be a sum of two terms $F_p^{+}(x)$ 
and $F_p^{-}(x)$, of which the first depends only upon the charge, position, velocity 
and acceleration of the particle p at its point of intersection with the backward 
light-cone of the point x, and the second depends upon the same variables at its 
point of intersection with the forward light-cone of x.⁴ Because of the linearity 
of the equations, $F_p^{+}$ and $F_p^{-}$ separately as well as all linear combinations of 
them are solutions. On the assumption that electromagnetic effects are causal 
in character and hence proceed from past to future, the normal practice is to 
select the retarded solutions $F_p^{+}$ only, according to which the field $F_p$ depends 
only on past positions of p, not on future ones. On the further assumption that 
electromagnetic fields are always derived from material sources rather than 
‘coming in from infinity’ (for example one could add to any solution the 
solution representing a uniform plane wave filling all of space-time), the total 
field F is uniquely determined by the source particle motions and the field 
equations. 


4 $F_p^{+}(x)$ is equal to the skew-symmetric part of the rank two tensor 

$$2e\left(\frac{1}{s^{3}}\,r\dot{z} - \frac{1}{s^{2}}\,r\ddot{z} - \frac{\langle r,\ddot{z}\rangle}{s^{3}}\,r\dot{z}\right),$$

where juxtaposition represents the outer or tensor product of 4-vectors, $\langle\ ,\ \rangle$ is the 
Minkowski inner product, $\dot{z}$ and $\ddot{z}$ are the 4-vector velocity and acceleration respectively of the 
source particle p at its point z of intersection with the light-cone of the point x, e is the charge of 
p, r is the null vector from z to x, and the scalar $s = \langle r,\dot{z}\rangle$ is the spatial distance between z and x 
as determined in the source-particle frame at z. (Tensor indices and the distinction between 
covariant and contravariant components of the same vector have been suppressed; otherwise 
this expression for F is taken from Anderson [1967], p. 220.) $F_p^{-}(x)$ is defined in the same way 
with a change of sign. The approximate law of motion used in most applications has the form 
$m\ddot{x} = eF(x)[\dot{x}]$, where $\ddot{x}$ is the 4-acceleration of the test particle, m and e are its mass and charge, 
and $F(x)[\dot{x}]$ is the contraction of F(x) with the test-particle 4-velocity $\dot{x}$ to produce a 4-vector. 
Because F is skew-symmetric, $F[\dot{x}]$ is always Minkowski-orthogonal to $\dot{x}$. The exact law of 
motion incorporates a radiation reaction term, and is discussed in Section 6. 




The preceding account may easily be modified to yield a retarded distant 
action version of relativistic classical electromagnetic theory. To do so, we 
replace the single continuous tensor field F(x) filling all of space-time by a 
retarded tensor force F. Like the vector gravitational force of classical gravitation 
theory, the tensor force F acts directly between pairs of particles at particular 
points in their history, without any reference to continuous fields. Since we are 
working in Minkowski space-time rather than classical space-time, however, 
the action is along the light-cone rather than being instantaneous. The force 
law governing F says that for each pair of events in which a particle q intersects 
at a space-time point x the forward light-cone of a particle p, p exerts upon q a 
force equal to $F_p^{+}(x)$ as defined previously. Since the tensor force F is determined 
by this law directly from the kinematic relations of the particles p and q in 
Minkowski space-time (together with invariant parameters such as mass and 
charge), this is a distant action force law, analogous to Newton's inverse square 
law for the vector gravitational force. The law of motion remains unchanged, 
with the total force F(x) on a test particle at x taken as the sum of the forces 
$F_p^{+}(x)$ from all source particles p. 
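
The retarded point at which a source world-line meets the backward light-cone of x can be located numerically. The following is only a minimal sketch, assuming units with c = 1, a fixed inertial frame, and a source that always moves slower than light; the function name `retarded_time`, the bound `t_min` and the two example world-lines are illustrative assumptions, not part of the theory as stated above.

```python
import math

def retarded_time(worldline, t_obs, x_obs, t_min=-1.0e6, iterations=200):
    """Find t_ret <= t_obs with |x_obs - worldline(t_ret)| = t_obs - t_ret (c = 1).

    worldline: hypothetical callable t -> (x, y, z) for a slower-than-light source.
    t_min: assumed lower bound on the retarded time.
    """
    def g(t):
        # Light travel time available minus the distance to be covered.
        # For subluminal motion g is strictly decreasing in t, so it has one root.
        return (t_obs - t) - math.dist(x_obs, worldline(t))

    lo, hi = t_min, t_obs            # g(hi) <= 0 always holds
    if g(lo) <= 0:
        raise ValueError("t_min is not early enough to bracket the retarded time")
    for _ in range(iterations):      # plain bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Source at rest at the origin; test particle at distance 5 at time t = 10.
at_rest = lambda t: (0.0, 0.0, 0.0)
t_static = retarded_time(at_rest, 10.0, (3.0, 4.0, 0.0))

# Source moving along the x-axis at half the speed of light.
moving = lambda t: (0.5 * t, 0.0, 0.0)
t_moving = retarded_time(moving, 3.0, (0.0, 0.0, 0.0))
```

In the first case the retarded time is simply the observation time less the light travel time over the fixed distance; in the second it depends on the source's motion. In either case the force on the test particle is fixed by the state of the source at that single earlier event, with no reference to anything in between.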

The final tensor term F entering into the law of particle motion will be the 
same for both theories, and thus the particles will execute the same motions. 
The only difference is the way in which this term is determined. For the field 
theory it is taken as the local value of a single continuous tensor field covering 
all of space-time, while for the distant action theory it is a sum of distinct 
retarded tensor forces exerted separately by the distinct source particles as 
they intersect the past light cone of the test particle at the point x. From the 
viewpoint of the distant action theory the introduction of the continuous 
tensor field F(x) and the partial differential equation governing it are simply an 
alternative mathematical device for specifying the tensor forces between 
particles, just as for Newtonian gravitation theory the introduction of a 
continuous scalar field of gravitational potential satisfying the Poisson 
equation is an alternative mathematical device for specifying the inverse- 
square gravitational forces between particles. In both cases this device may be 
useful for developing the deductive consequences of the underlying distant 
action theory (this will be illustrated later for the electromagnetic case), but in 
neither case need the field be taken as physically real. It is only a historical 
accident that for gravitation the distant action formulation was found first, 
while for electromagnetism the field formulation was found first. 

Since electromagnetic fields are observable only through the effects which 
they produce on matter, the fact that these two theories yield the same laws of 
motion for material particles implies that they are empirically equivalent. This 
in itself is a sufficient demonstration of the possibility of a distant action 
formulation of classical electromagnetic theory. However, it is useful to discuss 
in more detail some important formal respects in which this relativistic distant 
action theory differs from the Newtonian gravitational model. 




First, there is the use of a tensor rather than a vector force. This difference 
may be eliminated by identifying as the force exerted by p located at z upon q 
located at x the vector quantity $F_p^{+}(x)[\dot{x}]$ which enters directly into the law of 
motion for q, rather than the full tensor $F_p^{+}(x)$. In this way the law of motion 
assumes the familiar Newtonian form as an equation relating vector force 
directly to vector acceleration. The difference re-appears, however, in the fact 
that the vector force exerted by p on q now depends upon the velocity $\dot{x}$ of q as 
well as its position, in contrast with the Newtonian situation. Some such 
dependence is inevitable for a relativistic theory, either directly as here, or 
indirectly as in the use above of a tensor force acting by means of a non- 
Newtonian law of motion. This is because relativistic kinematics requires the 
4-acceleration of a test particle to be orthogonal to its 4-velocity, and therefore 
(unlike in the Newtonian case) the various accelerations induced by a source 
particle located at z on test particles at a fixed location x must vary in space- 
time direction depending on the velocity $\dot{x}$ of the test particle. This non- 
Newtonian dependence on the velocity of the test-particle may be introduced 
either in the law of motion (the first alternative chosen above), or in the force 
law. 
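
The orthogonality appealed to here follows in one line from the skew-symmetry of F. As a sketch, write $F[\dot{x}]$ for the contraction of F with the test-particle 4-velocity $\dot{x}$ and restore tensor indices (the index placement and the symbol $\eta$ for the Minkowski metric are notational assumptions, since indices are suppressed in the text):

```latex
\langle \dot{x}, F[\dot{x}] \rangle
  = \eta_{\lambda\mu}\,\dot{x}^{\lambda}\,F^{\mu}{}_{\nu}\,\dot{x}^{\nu}
  = F_{\lambda\nu}\,\dot{x}^{\lambda}\dot{x}^{\nu}
  = -F_{\nu\lambda}\,\dot{x}^{\lambda}\dot{x}^{\nu}
  = -F_{\lambda\nu}\,\dot{x}^{\lambda}\dot{x}^{\nu}
  \quad\Longrightarrow\quad
  \langle \dot{x}, F[\dot{x}] \rangle = 0 .
```

The last equality merely relabels the summed indices, so the quantity equals its own negative and must vanish. A law of motion of the form $m\ddot{x} = eF[\dot{x}]$ therefore yields a 4-acceleration orthogonal to the 4-velocity whatever the velocity of the test particle.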

Second, it is clear that the retarded force law will not directly satisfy the 
Newtonian requirement of equality of action and reaction, since the force 
exerted by p located at z on q located at x would have to be balanced by a force 
exerted on p located at z by q located at an earlier point x’ in its history. The 
force law alone does not guarantee that these forces will balance. However, the 
above force law does imply that when the two particles are close together or 
have small relative velocities then the forces which they exert upon one 
another during the same time interval will tend to be equal and opposite, so 
that the law of action and reaction is approximately satisfied under those 
conditions. The present feature of the distant action theory is therefore not in 
direct conflict with the observations supporting that law. (As noted in Hesse 
[1961] p. 137, Newton’s a priori argument for the law depends upon the forces 
acting instantaneously. The difficulty of reconciling electrodynamics with the 
law of action and reaction and with the associated conservation laws of energy 
and momentum will be discussed further in Section 7.) 

Third, it should be noted that the technical problem of determining the 
actual motion of a system of particles interacting under the retarded distant 
action forces is fundamentally different in character from that of Newtonian 
mechanics. Because of the retardation, in whatever coordinate system the 
dynamical problem is formulated one will find particle positions at one time 
coupled with forces exerted by other particles at earlier times. For this reason 
one cannot formulate the equations of motion as a system of ordinary 
differential equations of second order in an independent time variable, having 
a unique solution for any specification of the 6n values (for an n-particle 
system) of particle position and velocity at some fixed time. Instead, we require 




as initial data the n continuous particle histories over some time interval of the 
order of s/c where s is the maximum separation of any two particles of the 
system during this interval, and c is the speed of light. Because of the continuity 
of particle motion, these initial data are not in general equivalent to any finite 
set of real numbers. 
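
The need for a whole segment of history as initial data can be illustrated with a toy delay equation. The sketch below is not electrodynamics: it assumes the scalar equation x'(t) = -x(t - tau), integrated by a simple Euler scheme, and the names `integrate_delay`, `flat` and `ramp` are hypothetical. It shows only that two histories which agree at t = 0 but differ earlier lead to different futures, which could not happen for an ordinary first-order differential equation.

```python
def integrate_delay(history, tau, t_end, dt=1e-3):
    """Euler-integrate x'(t) = -x(t - tau) from t = 0 to t = t_end.

    history: callable giving x(t) for -tau <= t <= 0 (the required initial data).
    Returns the approximate value x(t_end).
    """
    n_delay = round(tau / dt)
    n_steps = round(t_end / dt)
    # Sample the prescribed history on the grid t = -tau, ..., 0.
    xs = [history(-tau + k * dt) for k in range(n_delay + 1)]
    for _ in range(n_steps):
        x_now = xs[-1]
        x_delayed = xs[-1 - n_delay]     # value one delay interval earlier
        xs.append(x_now + dt * (-x_delayed))
    return xs[-1]

# Two histories with the same value at t = 0 but different pasts.
flat = lambda t: 1.0
ramp = lambda t: 1.0 + t

a = integrate_delay(flat, tau=1.0, t_end=1.0)   # close to 0
b = integrate_delay(ramp, tau=1.0, t_end=1.0)   # close to 0.5
```

The delay tau plays the role of the light travel time s/c: the future is fixed only once the history over an interval of that length has been supplied.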

The infinitary character of the dynamical problem of electromagnetic theory 
is also reflected in the field formulation, in a different way. In field theory one 
may take the initial data at a single time or along a single spacelike 
hypersurface spanning the past light-cones of all of the particles: the spatial 
continuum of field values across this hypersurface replaces in the field model 
the temporal continuum of information about the past particle positions which 
determined those field values by the field equations. (In both cases the 
continuum may be reduced to a countable infinity by Fourier analysis.) 

Here we see an ambiguity in the classical concept of the number of degrees of 
freedom of a system. In both formulations an electromagnetic system appears 
as dynamically infinitary in the sense that a system-history cannot be fixed by 
the dynamical laws together with a finite number of real-valued initial 
conditions. By contrast, Newtonian particle systems are dynamically finitary, 
having 6n dynamical degrees of freedom. Only in the field formulation, 
however, does an electromagnetic system also appear as statically infinitary, in 
the sense that a complete description of the system at a single time also requires 
infinitely many real numbers, because of the new infinitary physical object 
(the field) added to the system. In the distant action formulation, by contrast, 
an electromagnetic system remains statically finitary (having 3n static degrees 
of freedom associated with the positions of the n particles, just as with 
Newtonian mechanics). The infinitary dynamical behavior is seen to be a 
consequence of the looser and more complex character of the dynamical laws 
governing these finitary objects, so that the temporally continuous history of 
their finitary properties can no longer be deduced using those laws from a finite 
set of initial values, in contrast with the Newtonian case. The problem appears 
as one of deductive weakness in the laws, rather than of increased intrinsic 
complexity of the physical objects which they govern. Since the statically 
infinitary character of the field itself as a physical system is often thought to be 
partially responsible for serious mathematical difficulties in both classical and 
quantum field theory, this aspect of distant action theories is one source of their 
appeal to physicists working on the foundations of particle dynamics. 


3 THE SCIENTIFIC STATUS OF FIELDS 


The conclusion I wish to draw from the foregoing considerations is that, at the 
level of classical physics, there are insufficient grounds for belief in the physical 
existence of fields. Here I am assuming, as seems implicit in the ontological 
arguments mentioned in Section 1, that ontology may be inferred from 




fundamental theory: the way to decide on scientific grounds what objects 
probably exist is to see what objects are assumed to exist in the best 
fundamental theory available. For this inference it is of course essential that 
the theory be regarded as fundamental, and that the objects be taken as 
primitive in the formulation of the theory: objects which are defined or 
constructed rather than taken as primitive may indeed also be said to exist, but 
not in the sense in which the existence of fields is here being debated. It is in this 
secondary sense of being definable within or reducible to a more fundamental 
theory that minds may be said to exist in a materialist theory, heat may be said 
to exist in kinetic theory, and space-time points may be said to exist in a 
relationist theory of space-time. The controversial existence claim associated 
with the positions of Cartesian dualism, caloric theory and the ‘substantival’ 
theory of space-time is not that these entities are definable within the true 
fundamental theory but rather that they must be taken as primitive within it. 
Similarly, it is uncontroversial that within a distant action theory one may 
define functions F(x) of space-time points which will have all of the formal 
properties ascribed to fields on the field theory. Thus there is no question that 
fields exist in the same secondary sense that minds and heat do; the question is 
only whether they exist in the fundamental sense that electrons do.⁵ 

The considerations of the previous section seem to undermine the argument 
for the existence of the field, by showing the existence of an alternative or 
competing fundamental theory which is as well supported or as likely to be 
true as the field theory but which does not assume the existence of the field. 
This conclusion is not based on a detailed theory of scientific support, but 
simply upon the plausible assumption that scientific support must be some 


5 Moreover, defined objects may well satisfy genuine laws having counter-factual force: 
psychology and thermodynamics are perfectly legitimate sciences even though today we 
believe them to deal with non-fundamental objects. Laws governing non-fundamental objects 
are expected ultimately to appear as consequences of the fundamental laws together with the 
analysis of the non-fundamental objects on the basis of the fundamental theory, but may be 
formulated and tested before such a reduction is available. For this reason, the argument of 
Stein [1970] seems to me to establish only the existence of fields in the secondary sense. Stein 
contends that Newton's inductive argument to the law of gravitation is coherent only if 
understood as referring to the single reference class of the values of the gravitational 
acceleration which would be experienced by a test particle at a given space-time point, and that 
the function giving the totality of these values is a field, so that Newton's argument depends 
upon the existence of fields. This argument seems only to show the existence of fields in the 
derivative sense, since the induction would be equally sound if it referred to fields as defined 
rather than as fundamental entities. For example, inductive arguments in thermodynamics 
presuppose the existence in the derivative sense (and the suitability as a reference class for 
induction) of something having the properties ascribed to heat, but need not assume a caloric 
fluid to exist as primitive or fundamental. It is clearly not existence in this sense which is 
required for example in Field's argument from the existence of fields to the existence of space- 
time points, since he admits at the outset (Field [1985], p. 33) that for relationism space-time 
points exist as defined entities, and thus he would presumably also allow a relationist to define 
field-values at the defined points. Field wishes to infer the existence of space-time points or 
regions as fundamental, and therefore must also be assuming the existence of fields as 
fundamental, on the basis of electromagnetic theory. 




function of confirmed empirical consequences together with simplicity, so that 
if two theories are empirically equivalent and comparable in simplicity then 
they must be comparable in degree of support. 

The factor of simplicity is important here, for it is that which rules out as 
irrelevant the spurious idealist or phenomenalist reductions of everything to an 
ontology of experience or sense-data. (Similar remarks apply to Cartesian- 
demon or brain-in-vat theories, and to the replacement of a theory by the set of 
its empirical consequences.) It may indeed be true that there exists a theory 
which is empirically equivalent to the materialist theory but which takes as 
primitive only terms referring to experiences (or to their demonic or vat-related 
causes), but this does not weaken the scientific argument for the existence of 
matter, because any such theory is so much more complex than the materialist 
theory that it cannot be regarded as equally well supported by their common 
body of data. 

Here I do not wish to argue that the distant action theory is better supported 
than the field theory (though I think that such a case could be made in view of 
the complex character of the field ontology), only that their support is 
comparable. The comparison between the two theories is analogous to that 
between the one-fluid and the two-fluid theories of electricity prior to the 
discovery of the electron (Whittaker [1951], pp. 57-9), or between the 
electrodynamic theories of Lorentz and of Einstein (Miller [1981]): there is a 
small difference in formal simplicity associated with an extra item of ontology 
(the second fluid, Lorentz’s ether, or the field respectively), but a complete 
agreement in empirical consequences. As the example of Lorentz's ether 
shows, such differences may tip the scales in the long run. But the conflict 
between field and distant action ontology at a classical level was disrupted by 
the emergence of quantum theory before the issue had time to resolve itself. 

Because of the equal or greater simplicity of the distant action theory, I 
cannot accept Stein’s characterization ([1970], p. 284) of this mode of 
elimination of fields as ‘a philosophically specious quasi-positivist reduction’. I 
agree with Stein that positivism can eliminate anything, but only at the cost of 
greatly complicating the theory. The sort of elimination which is scientifically 
significant is that which succeeds without complicating the theory appreciably, 
such as Einstein's elimination of Lorentz’s ether or Lavoisier’s elimination of 
Stahl's phlogiston. The elimination of the field in favor of distant action has this 
character, and therefore is not mere positivism. 

In view of the foregoing, we are naturally led to ask how the field theory 
came to be and to remain so universally accepted, and whether the 
considerations responsible for that acceptance do not perhaps still somehow 
bear in a significant way against the present distant action theory. This 
complex issue will be addressed in the remainder of the paper. The historical 
discussion is tentative, and based partly upon secondary sources. 

The easiest explanation of the historical success of field theory would be 




simply that Maxwell’s theory was a field theory and that no equally well 
supported distant action theory was available. However, this explanation 
seems insufficient. It appears (Whittaker [1951], pp. 267-70) that the 1867 
theory of L. Lorenz, based on the retarded potentials, could easily have been 
given a distant action interpretation equivalent to that of the preceding section 
(except for being formulated, like Maxwell’s theory, within classical space- 
time), though Lorenz himself did not do so. Moreover, (D'Agostino [1975], pp. 
268-9), it seems that C. Neumann in 1868 explicitly embraced the concept of 
retarded distant action, though apparently not in connection with Lorenz's 
potentials. Thus the conceptual elements of a retarded distant action theory 
equivalent to that of Section 2 (and hence to the modern form of the classical 
theory) were available at an early stage, well before field theory was generally 
accepted. What requires historical explanation is why such a distant action 
theory did not come to be recognized as a viable alternative to Maxwell's 
theory. 

Several types of consideration have been brought forward at various times 
as evidence for the truth of the field theory. In Sections 4-7 I shall discuss these 
in turn, from both the 19th century and the contemporary viewpoints. The 
goal is, first, to confirm in detail that from a modern viewpoint these 
considerations do not justify the acceptance of electromagnetic fields as real, 
and second, to discuss the extent to which these considerations provided such 
justification within the 19th century context. 

It should be stressed that in both contexts we are concerned only with the 
evidential relevance or degree of support provided by these considerations, i.e. 
the degree to which they provide grounds for supposing field theory more likely 
to be correct than distant action theory. One theory may well have scientifically 
relevant but non-evidential advantages over another, connected for example 
with ease of computation or vividness of representation, or (as stressed in Hesse 
[1961]) a greater capacity to suggest lines of modification and development, 
for example by reference to an intuitive model. Such advantages constitute 
good scientific reasons for preferring to work with one theory rather than 
another, but do not provide any evidential grounds for deeming that theory 
more likely to be correct. The issue here is the evidence for the physical 
existence of fields, not the scientific value of field theory in this broader sense. 

A particularly important class of scientifically relevant but non-evidential 
reasons for preferring one theory over another are those based on the degree to 
which a theory fits into a particular research program. As discussed in detail by 
Kuhn, Lakatos and others, the reasons for adopting a particular research 
program are usually not directly evidential in character, but rather derive from 
personal or cultural predisposition toward a particular type of theory, 
reinforced by analogical arguments from prior successes of theories of that type 
in domains far enough removed from the one in question that success there 
carries little or no direct evidential weight. These authors have argued 




convincingly for the scientific value of commitment to a research program by 
individual scientists, as a source of continuity and direction in their research 
efforts: better results are obtained in the long run if a variety of research 
programs are sustained continually until one of them produces a theory with 
overwhelming evidential superiority, than if the entire scientific community 
focuses upon the single theory which happens to have the greatest support at a 
given time, to the neglect of tenable but less well supported alternatives. 
Variety in research programs ensures a continued supply of developed 
alternatives, and increases the chances for the community as a whole to reach 
the truth, even while decreasing the chances of most individual researchers to 
be the successful discoverers, by motivating them to pursue theories with less 
evidential support than some of the alternatives. Because of this legitimate and 
indeed essential scientific function of research programs, programmatic 
considerations must certainly be accepted as good scientific reasons, despite 
their general lack of evidential force.⁶ 

Maxwell’s theory belonged to the 19th century research program (pursued 
mostly in Britain) aimed at the development of a mechanical analysis of the 
electromagnetic ether (Whittaker [1951], Schaffner (ed.) [1972], Siegel 
[1981]). The program was a very attractive one at the time, with analogical 
motivation from the success of Newtonian mechanics, and the striking 
ambition of unifying physical theory by reducing the whole of optics and 
electromagnetic theory to mechanics. On the other hand it had no real 
evidential support when it gained ascendancy in Britain, and there was 
nothing unscientific in the simultaneous adherence on the Continent to an 
alternative program based on distant action (Woodruff [1962], Wise [1981]). 
The acceptance of Maxwell's theory in Britain may adequately be explained 
by its conformity to the prevalent research program of ether mechanics, 
without attributing to it any definite evidential superiority over distant action 
competitors. It is the acceptance of Maxwell’s theory on the Continent, where 
it was not in agreement with the predominant research program in 
electromagnetic theory, which is normally explained by appeal to a supposed 
evidential superiority of Maxwell's theory over all distant action competitors, 

6 The particular model sketched here of the role of research programs in science deviates from 
those of Kuhn and Lakatos in assuming that the scientific support of theories produced within 
competing research programs may be compared in an objective manner, by reference to their 
intrinsic simplicity and degree of confirming evidence. Kuhn denies the possibility of any sort of 
objective comparison of such theories (the doctrine of incommensurability), and Lakatos adheres 
to a form of falsificationism for research programs, wherein a program is abandoned because it 
fails to produce the right kind of results (is degenerating rather than progressive). Kuhn's 
conception of revolution as preceded by crisis also has a falsificationist character. The present 
model, by contrast, is a form of inductivism for research programs, wherein a program is 
supported by its success in producing well supported theories, rather than falsified by its failure 
to do so. I would also reject Lakatos's requirement that a successful research program produce 
theories predicting new empirical phenomena: on an inductivist concept of support a research 
program may also succeed by producing a theory which is better supported by the existing data 
than are any of its competitors. 




decisively demonstrated by Hertz’s production of electromagnetic waves. This 
account fits the model of science mentioned above: the competing research 
programs are pursued independently until one of them produces a theory 
having evidential superiority strong enough to overcome the force of partisan 
commitment, thereby converting the opposition and leading to the demise of 
the competing programs. This common historical explanation of the success of 
field theory on the Continent is called into doubt by the present denial that 
Maxwell's theory possesses the required evidential superiority over retarded 
distant action theory. An alternative explanation will be suggested later. 

I pass now to the survey of considerations thought to provide evidence in 
favor of field over distant action electromagnetic theories. 


4 MEDIUM-DEPENDENT WAVE PROPAGATION 


The first family of considerations centers around Maxwell's successful 
development of an electromagnetic account of light on the basis of his field 
theory, and the empirical confirmation of this account through Hertz's 
successful production and detection of electromagnetic waves. Indeed, Hertz's 
work seems to have been widely accepted as a crucial experiment decisive in 
favor of field over distant action models. (Whittaker [1951] p. 328, Miller 
[1981], p. 93, Harman [1982] p. 109; D'Agostino [1975] gives a detailed 
analysis of Hertz's research on electromagnetic waves and its contemporary 
theoretical context.) It is not necessary here to distinguish between the high 
frequency waves which Maxwell conjecturally identified with light waves and 
the lower frequency waves which Hertz actually produced and detected, since 
Maxwell's theory derives empirical support in a similar way from both sources. 
Hertz's observations provided more direct support because the electromagnetic 
theory of light requires the additional hypothesis that light waves are 
generated and absorbed through interaction with electric currents, whereas 
for Hertz's waves this was directly shown by the experiments. 

This and the next section deal with what may be called the kinematic 
properties of electromagnetic processes, especially waves. By this I mean the 
properties of the system of electromagnetic forces considered simply as a 
geometric object defined on space-time by the field equations or force law (in 
mathematical terms of a tensor field), without regard to its mechanical effect 
upon material systems through the law of particle motion. Sections 6 and 7 
deal with the latter category of energetic properties of electromagnetic 
processes. 

From Maxwell's equations it follows that changes in source currents are 
associated with changes in their electromagnetic fields, which propagate as 
transverse waves with a speed numerically close to that of light. Since the fields 
are observable through the accelerations which they induce in charged test 
particles or the polarization which they induce in matter, the waves are 




observable as changes in these quantities; when the source process is periodic 
the test process will manifest the same periodicity. Moreover, in material media 
the propagation process is affected by the electromagnetic characteristics of 
the medium in a way which accounts for refraction and dispersion. All of these 
effects were produced and observed by Hertz in 1885-8. These properties of the 
waves depend only upon their kinematic description as continuously propa- 
gated changes in the electric and magnetic field vectors, and follow from 
Maxwell's equations alone. 

From a modern perspective, there seems to be little doubt that a retarded 
distant action electromagnetic theory such as that given in Section 2 can 
account for the kinematic properties of electromagnetic waves as well as 
Maxwell's theory does. For waves propagating in empty space this follows 
directly from the mathematical relations between the two theories: the 
retarded forces due to an accelerated source particle on the distant action 
theory will display the same periodicity and transverse character as predicted 
by Maxwell's theory, hence will produce the same frequency-dependent effects 
on a test particle or circuit in accordance with the law of motion. 

Waves propagating in material media appear at first glance to present a 
much more serious difficulty, since according to Maxwell's theory the kinematic 
properties of the propagation (refraction, dispersion) will depend upon the 
properties of the medium, not merely upon the source current. Indeed, such 
dependence of electromagnetic effects upon intervening media had been one of 
Faraday's arguments for the reality of an ether (Hesse [1961], pp. 198-9, 
203-6). However, this difficulty has been resolved by the subsequent 
development of accounts of macroscopic radiative processes in material media 
by classical electron theory, on the basis of Maxwell's equations applied at the 
level of single charged particles (e.g. Lorentz [1909], Rosenfeld [1951]). While 
such accounts are customarily presented in the context of a field interpreta- 
tion, their results are equally applicable on a retarded distant action 
interpretation, so long as they depend only upon Maxwell's equations and the 
law of motion applied at the level of individual charged particles. 

The electron-theoretic analysis of radiative processes may be translated into 
distant action terms along the following lines. The periodic acceleration of the 
source particle causes, by the direct retarded force law, corresponding periodic 
forces in all other particles at distance s at times s/c later. The resultant 
accelerations of these particles will contribute to the retarded forces which they 
produce (in field language, the particles 'emit secondary waves'). These 
secondary effects will have a phase relation to the source which depends upon 
the distance s and the source frequency, and an intensity which depends upon 
the distance s. If we are observing a radiative effect of a source S on a test 
particle T and there is also a material body or medium M (a sufficiently large 
aggregation of elastically bound charged particles) sufficiently near both S and 
T (and therefore sufficiently near the line joining them), then the combined 
secondary effects S→M→T may be comparable in magnitude to the direct 
effect S→T. Because of the finite extent of M these secondary forces will have 
different phase relations to the source S. In the rigorous electron-theoretic 
analysis of refraction and dispersion (e.g. Rosenfeld [1951], Ch. VI, ‘extinction 
theorem’) it is proven that the combined secondary effects of M on T are 
precisely sufficient to cancel or ‘extinguish’ the direct effect of S on T and to 
produce a resultant effect which corresponds in frequency and spatial 
dependence to what in the macroscopic Maxwell theory would be described as 
‘the wave from S refracted through the medium M reaching the particle T’. 
Since this analysis depends only upon the kinematic features of electromag- 
netic wave propagation as derived from Maxwell's equations, it holds equally 
well for a distant action interpretation of those effects. In this general way all of 
the kinematic dependencies of classical electromagnetic wave propagation 
upon intervening material media may be accounted for by a retarded distant 
action theory, on the basis of classical electron theory. 
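
The phase and intensity dependence described in this account can be written out schematically. What follows is only a sketch under assumed notation: the symbols \omega, a_0, A_j, r_j, r'_j and \delta_j are introduced here for illustration and do not appear in the cited analyses, and the 1/s falloff is the standard far-zone behaviour of the radiative part of the retarded force rather than something derived in the text.

```latex
% Source S oscillating with (assumed) angular frequency \omega:
\[ a_S(t) = a_0 \cos(\omega t) \]
% Radiative part of the retarded force on a particle at distance s:
\[ F(s,t) \;\propto\; \frac{1}{s}\, a_S\!\left(t - \frac{s}{c}\right)
        = \frac{a_0}{s}\cos\!\left(\omega t - \frac{\omega s}{c}\right) \]
% so the phase lag relative to the source is \omega s / c.
% At the test particle T, with particle j of M at distance r_j from S
% and r'_j from T:
\[ F_T(t) \;=\; F_{S \to T}(t) \;+\; \sum_{j \in M}
      A_j \cos\!\left(\omega t - \frac{\omega\,(r_j + r'_j)}{c} - \delta_j\right) \]
```

Here A_j and \delta_j are placeholders for the amplitude and response phase of bound particle j, which in a full treatment also absorb the mutual influence of the particles of M on one another. The extinction theorem is then the statement that, for a sufficiently large medium, the sum cancels the direct term and leaves the refracted wave.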

The empirical discoveries and theoretical developments in the electromagne- 
tic theory of matter and classical electron theory upon which this distant 
action account of electromagnetic wave propagation relies were not available 
in the 19th century. On the other hand such developments were certainly 
anticipated, and with good reason. To account for the emission and absorption 
of light by matter at all on Maxwell's theory it is necessary to postulate the 
existence of charges elastically bound within matter, and attempts to account 
for the macroscopic optical properties of matter in terms of electromagnetic 
theory applied to such charges date from a proposal of Stokes in 1852, and 
work of Maxwell himself in 1869 (Whittaker [1951], pp. 262-4). Moreover, 
advances in electrochemistry in the first half of the century had provided 
strong independent support for the existence of electrical properties of matter 
on a molecular scale. Therefore, insofar as Maxwell's theory was believed to be 
capable of eventually offering such an explanation of electromagnetic 
properties of material media, including their influence on the propagation of 
electromagnetic effects, a distant action theory equivalent to Maxwell's for 
particle interactions could have been recognized as equally capable of doing so. 
(The discussion of Lorenz in Whittaker [1951] p. 270 suggests a specific case in 
which this point was overlooked.) As above, one need only construe the theory 
which is applied to the fundamental particles as a retarded distant action 
theory rather than a field theory. 

In this regard the essential difference between Maxwell's original theory and 
distant action competitors was simply the macroscopic character of Maxwell's 
theory, in which electromagnetic properties such as susceptibility to polariza- 
tion were directly ascribed to matter in bulk, rather than being explained on a 
molecular basis. In contrast, distant action theories have always been 
formulated in terms of individual particles. This might have been a non- 
evidential but scientifically relevant advantage of the Maxwell theory within 
the 19th century context, since it allowed the study of such macroscopic 
properties of matter to be pursued independently of molecular problems. But of 
course this is no more evidence for the existence of fields than the utility of 
Fourier’s macroscopic theory of heat conduction is evidence for the existence 
of a continuous caloric fluid, or hydrodynamics evidence for the continuity of 
matter. It remains true that once the macroscopic theory has successfully been 
reduced to a fundamental particle theory, it may equally well be recast in distant 
action terms. 

For these reasons, it does not appear that the medium-dependent propaga- 
tion of electromagnetic effects should be counted as providing significant 
scientific evidence in favor of field rather than retarded distant action theory, 
even within the 19th century context. The program of molecular reduction 
being carried out on the basis of Maxwell's theory could equally well have been 
cast in retarded distant action terms, and in this way phenomena such as those 
discovered by Hertz could perfectly well have been accounted for on a distant 
action basis. 


5 RETARDATION AND SPACE-TIME 


A separate kinematic consideration is the status of retarded as opposed to 
instantaneous distant action. The immediate physical conclusion of Hertz's 
experiments was not that fields or an electromagnetic ether exist, but rather 
that electromagnetic effects are propagated with a finite velocity, on the order 
of that of light. This conclusion was established on the basis of interference 
experiments: since the phase of the oscillatory inductive effect depends upon 
the path of propagation from the oscillating source to the point of detection, the 
propagation process is inferred to require a finite time depending on the length 
of the path, i.e. to possess a finite velocity. The fact of finite velocity is 
established by Hertz with great care and detail, and forms the central 
conclusion of his work on electromagnetic waves.⁷ 
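
The inference from path-dependent phase to finite velocity can be put in one line. In this sketch the symbols \omega (source angular frequency), L_1 and L_2 (two path lengths) and v (propagation speed) are labels assumed here for illustration, not Hertz's own notation.

```latex
% Phase difference between effects arriving by paths of length L_1 and L_2:
\[ \Delta\varphi \;=\; \omega\,(t_2 - t_1) \;=\; \frac{\omega\,(L_2 - L_1)}{v} \]
% Instantaneous action corresponds to v \to \infty, giving \Delta\varphi \to 0
% for every pair of paths. An observed \Delta\varphi \neq 0 that varies with
% L_2 - L_1 therefore requires a finite v.
```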

This result falsifies instantaneous distant action theories, but does not bear at 
all against retarded theories such as that of Section 2. Hertz's discoveries are 
thus sufficient to undercut the distant action program only insofar as there 
were independent grounds for restricting attention to instantaneous rather 
than retarded distant action theories. 


7 Hertz chose for the 1892 volume reprinting his papers on this topic the title ‘Investigations 
concerning the Propagation of the Electric Force’. The 1888 paper forming the keystone of this 
work is entitled, ‘On the Velocity of Propagation of Electromagnetic Action’. (These titles are 
rendered somewhat loosely in the 1893 English translation, which is unreliable in a number of 
places.) In the opening sentence of the Foreword to Hertz [1892] (not included in the English 
translation), Hertz refers to the papers as, ‘my works on the temporal propagation of electrical 
force and on electric oscillations’. In the 1892 Introduction surveying the work reported in the 
original papers he says, ‘... by the experiments above sketched the propagation in time of a 
supposed action-at-a-distance is for the first time proved. This fact forms the philosophic result 
of the experiments; and, indeed, in a certain sense the most important result.’ (Hertz [1893], p. 
19). 


It is indeed the case that the best-known 19th century distant action 
theories and research programs (for example that of Weber) were based on 
instantaneous action, and this is the aspect of distant action research which 
has received the most attention from historians and philosophers. (For 
example, in the survey of the 19th century distant action electromagnetism of 
Woodruff [1962] theories involving retarded action are deliberately omitted 
from the discussion.) From our present perspective, however, it is the retarded 
theories such as those of Riemann and of C. Neumann which appear as most 
important, and the crucial question is whether within the 19th century 
context they represented a scientifically viable line of research. 

Of course it is possible to reject retarded action on the a priori grounds that a 
cause cannot act after it has ceased to exist, just as it is possible to reject 
instantaneous distant action on the grounds that a cause cannot act at a place 
where it is not present. In fact an argument of this form was used by Hertz 
himself to infer the reality of the states of polarisation of empty space from the 
fact of retardation.⁸ But there is no more reason to accept such an a priori 
principle in the temporal case than there is in the spatial case. What we wish to 
know is whether there were substantial scientific grounds for the rejection of 
retarded action in the 19th or 20th century contexts, not dependent upon 
such metaphysical principles or upon commitment to a particular research 
program. 

An argument against retardation which springs immediately to mind for a 
modern philosopher of science derives from the structure of classical space-time 
(Sklar [1974], Friedman [1983]). Within that framework the only type of 
distant action forces which are geometrically natural are those which act 
instantaneously, that is, along the hypersurfaces of absolute simultaneity; the 
introduction of a retardation associated with the speed of light appears 
unnatural and ad hoc. Moreover, such a theory is to some degree self-defeating 
in that the velocity determining the degree of retardation must be defined with 
respect to some privileged frame of reference in classical space-time, and the 
only obvious candidate is the one which performs this function for the 19th 
century field theories, namely the rest-frame of the ether. The very existence of 
a distinguished finite speed implies, in classical space-time, the existence of a 
distinguished frame of reference (namely, the unique frame in which those 
effects are found to propagate with that speed in all directions), and therefore at 
least makes plausible the existence of an ether, which is what one is trying to 
deny. I do not know whether this kinematic argument for the existence of an 
ether was ever explicitly formulated in the 19th century. (The discussion of 
Hirosige [1976] pp. 42-3 suggests not.) It is implicit in the remark of Einstein 
[1905] (reprint, p. 38) that special relativity made the ether superfluous in the 
sense of no longer requiring a distinguished frame of reference with respect to 
which electromagnetic processes are to be described. 


8 ‘The most direct conclusion [from the finite velocity of propagation of inductive action] is the 
confirmation of Faraday's view, according to which the electric forces are polarisations existing 
independently in space. For in the phenomena which we have investigated such forces persist 
in space even after the causes which have given rise to them have disappeared. Hence these 
forces are not simply parts or attributes of their causes, but they correspond to changed 
conditions of space.’ (Hertz [1893], pp. 122-3, from the conclusion of the 1888 paper in which 
the finite velocity of propagation was first established; my italics and interpolation in square 
brackets.) 


However, I doubt whether this argument carries much force in the 19th 
century context. In the first place, it overlooks the fact that within that context 
it was already accepted on grounds independent of electro-dynamics that there 
exists a privileged frame of reference, namely that of absolute space, as shown 
to exist by Newton's argument from absolute acceleration (the famous ‘bucket 
argument’). Only after 20th century analysis and criticism has it been 
concluded that Newton’s argument proves only the existence of an affine 
connection on space-time, not also the existence of a distinguished family of 
timelike geodesics of that connection, defining a distinguished frame of 
reference. Newton had shown that this frame could not be identified by 
mechanical experiments in consequence of what we call the Galilean 
invariance of Newtonian mechanics, but he was also thought to have shown 
that it must exist, and so it need not have been a great shock to discover that it 
could be detected by electromagnetic or optical means; indeed, this might have 
been seen as a striking confirmation of Newton’s analysis of space-time. 

Thus a retarded distant action theorist need not have conceded the existence 
of a mechanical ether at rest in absolute space, even within the context of 
classical space-time. The distinguished timelike lines along which optical and 
electromagnetic effects propagate would simply define an additional element of 
geometric structure within classical space-time, yielding what is called 
Maxwell geometry in Mundy [1986]. By virtue of this geometric structure alone 
it would follow that motion of an observer with respect to absolute space would 
be detectable by means of the perceived spatial anisotropy of the propagation of 
electromagnetic and optical effects, without any reference to a mechanical 
ether. 
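
The geometric point can be illustrated with the Galilean transformation. This is a sketch under assumed notation: \mathbf{n} is a unit direction of propagation, \mathbf{v} the observer's velocity relative to absolute space (with v < c), and \theta the angle between them.

```latex
% Galilean transformation to an observer moving with velocity \mathbf{v}:
\[ \mathbf{x}' = \mathbf{x} - \mathbf{v}\,t, \qquad t' = t \]
% An effect propagating as \mathbf{x} = c\,\mathbf{n}\,t in the distinguished
% frame has, for the moving observer, velocity and speed
\[ \mathbf{u}' = c\,\mathbf{n} - \mathbf{v}, \qquad
   |\mathbf{u}'| = \sqrt{c^{2} + v^{2} - 2cv\cos\theta} \]
% which ranges from c - v (\theta = 0) to c + v (\theta = \pi).
```

The measured speed is the same in all directions only when v = 0, so the directional dependence by itself picks out the state of rest, with no appeal to a mechanical medium.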

In the second place, even someone who was convinced for example by the 
criticisms of Mach [1883] that Newton’s argument was unsound, would have 
been able to reply with a direct appeal to experiment. The existence of a 
distinguished family of parallel hypercones in space-time is in itself no more 
puzzling or implausible than the existence of a distinguished family of parallel 
hyperplanes. The existence of the hyperplanes of simultaneity could be 
empirically demonstrated (in classical space-time) by the independence of 
clock time under transport of clocks, and in the same way the existence of the 
light cones (in Maxwell space-time) could be empirically demonstrated by the 
independence of light velocity from the velocity of the source. (This argument 
is used today, for example in Geroch [1978], pp. 55-7. The observations of 
double stars by de Sitter which provide its main empirical basis were, however, 
not made until 1913; cf. Miller [1981], pp. 282-3.) From these two empirical 
propositions it follows geometrically that there is a distinguished frame of 
reference. 

Like the argument from wave propagation, therefore, these arguments from 
the structure of space-time seem to have no real force in favor of fields as 
opposed to retarded distant action, even within the 19th century context. In 
the contemporary context, of course, this is even more clearly the case. In 
Minkowski space-time there is no absolute simultaneity, and the geometrically 
natural types of distant action forces are precisely those which act along the 
light cones, as required. Moreover, the elimination of absolute simultaneity 
allows the light cone to exist without determining a privileged frame of 
reference, since owing to the modification of the metric structure all inertial 
frames measure the speed of light to be equal in all directions (Mundy [1986]). 

A second potentially problematic kinematic aspect of retarded distant action 
(though again I do not know if this particular argument was formulated in the 
19th century) is the time-asymmetry associated with the choice of a retarded 
rather than an advanced distant action. This might be thought to favor field 
theory, insofar as the mechanical interpretation of field theory encourages us 
to think of electromagnetic action as a causal process in some intuitive sense, 
so that only retarded and not advanced effects should occur because causes 
precede their effects. In fact, however, this conclusion does not follow from the 
principles of the field theory alone. Nothing in the formal structure of field 
theory reflects this intuitive time-asymmetric conception of causation: 
Maxwell's equations also have advanced solutions, which correspond to 
advanced distant action, and the field theory alone provides no better basis for 
ruling out as unphysical the electromagnetic processes corresponding to such 
solutions than the distant action theory does for ruling out advanced distant 
action forces. The only difference is that in the distant action theory we have 
included this time asymmetry explicitly in the force law, whereas in the field 
theory it is imposed as an external constraint upon the chosen solutions to the 
time-symmetric field equations. There is no sense in which the field theory 
explains the empirical fact of retardation while the distant action theory does 
not.⁹ 
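
The time-symmetry of the field equations can be seen directly in the standard potential formulation. The sketch below assumes the Lorenz gauge and SI units, with \phi the scalar potential and \rho the charge density; this notation is supplied here and is not used in the text.

```latex
% Wave equation for the scalar potential:
\[ \nabla^{2}\phi - \frac{1}{c^{2}}\frac{\partial^{2}\phi}{\partial t^{2}}
      = -\frac{\rho}{\varepsilon_{0}} \]
% The differential operator contains only a second time derivative, so it is
% unchanged under t \to -t. Both of the following solve the same equation
% (upper sign retarded, lower sign advanced):
\[ \phi_{\mathrm{ret/adv}}(\mathbf{x},t) = \frac{1}{4\pi\varepsilon_{0}}
      \int \frac{\rho\!\left(\mathbf{x}',\; t \mp |\mathbf{x}-\mathbf{x}'|/c\right)}
                {|\mathbf{x}-\mathbf{x}'|}\; d^{3}x' \]
```

Selecting the upper sign is an additional condition on the solutions, not a consequence of the wave equation, which is the sense in which the retardation is imposed from outside in the field theory.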


9 In connection with the general problem of the source of physical time-asymmetry (e.g. Davies 
[1974], Gal-Or [1981]) it makes a great deal of difference whether the time asymmetry of 
electromagnetism is placed in the fundamental laws as with the retarded distant action theory, 
or in the boundary conditions as with the field theory. However, in the present unsettled state of 
the discussion of physical time asymmetry this difference does not seem to bear in any strong 
way in favor of one theory over the other. Moreover, distant action theories (e.g. that of 
Wheeler and Feynman [1945], [1949]) may be constructed in which the fundamental force 
law is time-symmetric (half retarded and half advanced) and the observed time-asymmetry 
derives from boundary conditions as with the field theory. 


Like the kinematic argument from wave propagation, therefore, these 
considerations of space-time structure do not seem to provide evidence against 
retarded distant action, either in the 19th or the 20th century context. The line 
of scientific justification for the impact of Hertz's discoveries proposed at the 
beginning of this section thus seems tenable only on the basis of energetic rather 
than merely kinematic objections to retarded distant action. The remaining 
two sections of the paper are concerned with such lines of argument. 


6 ENERGY AND MOMENTUM OF RADIATION 


There are two main classes of argument to consider in respect of the energetic 
properties of electromagnetic processes, concerned respectively with the 
empirical facts of radiative transfer of energy and momentum (discussed in this 
section), and of conservation of energy and momentum (discussed in the next 
section). In both cases it has been claimed that the field theory yields a better 
account of the facts in question. 

In the modern field theory certain functions of the field F(x) having the 
dimensions of energy-momentum (or its density per unit volume of space or of 
space-time) are defined, which are assumed to represent actual energy- 
momentum carried by the field and located near the point x. Using this 
assumption, the field theory aims to show that the total energy-momentum of 
field and particles remains constant, with electromagnetic interactions 
involving a spatio-temporally continuous transfer of this fixed quantity of 
energy-momentum between field and particles. There are perhaps some 
grounds for doubt whether this objective is fully attained by the existing 
classical formalisms. In this section and the next I will argue that even 
supposing this goal to be fully attained, there are not sufficient grounds for 
accepting fields as real. 

It is known empirically that light and other electromagnetic waves carry 
energy and momentum, and an electromagnetic theory must account for 
these energetic properties of the waves as well as the kinematic ones. The basic 
empirical data are the loss of energy (very easily observed) and of momentum 
(observable only with great difficulty) from a radiating source, and the 
corresponding gain of energy and momentum accompanying absorption of 
radiation. Since these changes are observed to occur even when the source or 
absorber is placed in a vacuum, they cannot be due to ordinary heat 
convection by the molecules of the atmosphere. 

I will consider the issue first from the viewpoint of the modern field theory. 
We may again appeal to the empirical equivalence of the two theories: insofar 
as the field theory is believed to account adequately for the observed energetic 
properties of electromagnetic waves in their interaction with matter, so also 
must the retarded distant action theory. We need merely translate the 
field-theoretic account into distant action terms. The loss of 
mechanical energy and momentum from a material system by emission of 
radiation is represented in the standard theory by the inclusion of a radiation 
reaction term in the law of motion for charged particles, which represents an 
additional damping or resistance to acceleration beyond that given by the 
normal law of motion for neutral particles (cf. Barut [1964] or Rohrlich 
[1965]). In field language this term represents the necessity for an impressed 
force to communicate to the charged particle not only the direct time-rate of 
increment in kinematic energy-momentum corresponding to the actual 
acceleration produced, but also the increment in energy-momentum attri- 
buted (by the field theory) to the radiation field generated by the particle in 
consequence of that acceleration, as required for conservation. In the distant 
action theory we retain the same law of particle motion, but regard the 
damping term as a fundamental feature of the dynamics of charged particles. 
The inclusion of such a term is justified by the empirical fact of energy loss from 
accelerated charges, as with radiative cooling or the measurable damping of 
the oscillations in a radio antenna. Both theories will predict the same 
empirical rate of energy loss as a function of acceleration or of temperature. 
The difference is that on the distant action theory this energy simply vanishes 
in consequence of the damped law of motion obeyed by charged particles, 
while on the field theory it goes into the field.¹⁰ 
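
For concreteness, the damped law of motion referred to here is commonly written, in the nonrelativistic limit, in the Abraham-Lorentz form. The sketch below uses SI units; the symbols m, q, \mathbf{v} and \mathbf{F}_{\mathrm{ext}} are labels assumed here, and the relativistic treatments cited above differ in detail.

```latex
% Nonrelativistic law of motion with a radiation reaction (damping) term:
\[ m\,\dot{\mathbf{v}} \;=\; \mathbf{F}_{\mathrm{ext}}
      \;+\; \frac{q^{2}}{6\pi\varepsilon_{0}c^{3}}\,\ddot{\mathbf{v}} \]
% Larmor rate of energy loss from the accelerated charge:
\[ P \;=\; \frac{q^{2}}{6\pi\varepsilon_{0}c^{3}}\,|\dot{\mathbf{v}}|^{2} \]
% For motion periodic with period T, integration by parts (the boundary term
% \dot{\mathbf{v}}\cdot\mathbf{v} cancels between 0 and T) gives
\[ \int_{0}^{T} \ddot{\mathbf{v}}\cdot\mathbf{v}\;dt
      \;=\; -\int_{0}^{T} |\dot{\mathbf{v}}|^{2}\;dt \]
% so the work done by the damping term over a cycle equals minus the
% Larmor energy loss over that cycle.
```

Both theories use this same equation of motion and so predict the same loss rate; they differ only in whether the energy P dt is said to pass into a field or simply to leave the mechanical system.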

In the case of absorption, the communication of mechanical energy and 
momentum from a radiation field to a material body is directly accounted for 
by reference to the laws of charged particle motion: in addition to the emission 
of secondary radiation, a classical bound electron accelerated by an external 
field may communicate its acquired mechanical energy directly to other 
electrons by collision, thereby increasing the mechanical temperature of the 
body. This absorption mechanism is the same on either theory. The charged 
particles in the absorber will experience exactly the same acceleration on both 
theories, with the same resultant increase in temperature, since these 
processes follow from the law of motion alone. 


10 The damping term represents the rate at which energy-momentum is lost as radiation. Since on 
Maxwell's theory the amount of radiation is proportional to the acceleration of the particle, the 
damping term must be proportional to the time rate of change of the acceleration, and therefore 
contains a third derivative of the particle position, unlike the Newtonian equation of motion. In 
the standard field formulation this leads to problems in the physical interpretation of the 
equation of motion, including possible causal anomalies. (For philosophical discussion see 
Grünbaum [1976], Grünbaum and Janis [1977].) At least in part these problems derive from 
an attempt to construe the dynamical problem as analogous to the Newtonian one, with only 
the particle positions and velocities (together with the instantaneous state of the field) as initial 
conditions; the problem for the field theory is how then to fix the initial accelerations which are 
also required in order to determine a unique solution to the third-order equation of particle 
motion. This problem does not arise in the distant action theory of Section 2 because the use of 
continuous segments of the particle histories as initial conditions ensures that there is a given 
initial acceleration as well as initial position and velocity when the equation of motion is 
applied to predict future particle motion. The weight of this point as an argument in favor of the 
distant action theory will not be considered here. 


On the field theory there is a continuous transmission of energy and 
momentum from the source through the field to the absorber. On the distant 
action theory the loss of energy at the source and the gain of energy at the 
absorber are independent effects of a common cause, namely the acceleration 
of the source particles, so there is no reason to expect the energy-momentum 
lost to equal that absorbed. The same is true for the field theory in general, 
since in a typical radiative process most of the radiation escapes to infinity. 
Disagreement might arise only in the case of complete absorption, where the 
source is completely surrounded by perfectly absorbing material. Here it will 
follow from the conservation theorem of field theory that the energy- 
momentum absorbed equals that lost by the source, in agreement with 
observation. On the distant action theory, by contrast, we do not expect this 
equality. 

This is a mistake, however, because of the empirical equivalence of the two 
theories. Suppose that there really is such a thing as a perfect absorber within 
the classical field theory. It must then follow from the field theory that the 
radiation from the enclosed source does not disturb the field outside the 
absorber, and hence it will follow from the conservation result of field theory 
that the net mechanical energy-momentum gained by the particles of the 
absorber in consequence of the primary and secondary radiative processes due 
to the enclosed source is equal to the net amount lost by the particles of the 
source due to radiation reaction. However, these two mechanical quantities 
are functions of the particle motions alone, and thus are the same in both 
theories, so that the distant action theory will also predict this observed 
equality. 

This result illustrates how the deductive apparatus of the field theory, 
including the formal attribution of energy-momentum to the field and the 
appeal to the formal conservation result of field theory, may serve as a device 
for proving theorems within the distant action theory. This is analogous to the 
use often made of the continuous scalar potential in Newtonian gravitational 
theory. The introduction of these formal fields yields a particular kind of 
representation of the deductive content of the distant action theories, which 
may be useful for proving theorems. But since all of the results are 
consequences of the distant action theory alone, this deductive utility of the 
field representation provides no basis for acceptance of the fields as actually 
existing fundamental entities. (Of course the fields exist as defined objects, as 
stressed in Section 3.) This deductive use of the fields as defined objects is 
analogous to the use made of coordinate lines in geometry. 

From a modern viewpoint, therefore, the empirical facts of radiative energy 
and momentum transfer do not provide significant evidential support for field 
theory over retarded distant action theory. 

The 19th century picture is somewhat more complicated, since the theory of 
radiative energy transfer was in a continual state of development during the 
whole period, being left in a rather incomplete form by Maxwell and developed 
further by his successors such as Poynting, Larmor, Abraham, and Lorentz, 
with essential contributions coming as late as the 1938 work of Dirac on 
classical radiation reaction. (See the brief history in Rohrlich [1965], Ch. 2; of 
course this development was strongly affected by special relativity and the 
introduction of the electron.) This makes it hard to determine just how much 
evidential support Maxwell’s theory derived from its incomplete account of 
such processes during the period of its initial acceptance. The essential point 
remains, however, and could perfectly well have been articulated within the 
19th century context: whatever account might ultimately be given of these 
radiative processes by the field theory, so long as the fields are generated by the 
source particles in accordance with Maxwell’s equations and affect the 
mechanical energy and momentum of material particles only through some 
equation of motion into which the fields enter, that same account will also be 
applicable within a retarded distant action theory postulating the same 
equation of motion. Therefore such considerations cannot possibly yield 
evidential grounds for rejecting retarded distant action in favor of fields. 

However, the picture of a continuous flow of energy and momentum from 
source to absorber through the field has appeal of a non-evidential character. 
First, it satisfies one of the desiderata of the mechanist research program, that of 
providing an intuitively visualizable model of a continuous causal process 
mediating the electromagnetic action (though the process is no longer 
mechanical in any definite sense). Second, with the recognition of energy as a 
unifying concept in physical theory in the second half of the 19th century, it 
was appealing to think of energy as a kind of real and localized though non- 
material ‘stuff’ which retained its identity while flowing from place to place. 
The appeal of these considerations in the case of Poynting is noted for example 
by Hesse ([1955], pp. 351-2). Within the British research tradition such later 
views (culminating in the work of Larmor) may be seen as representing a 
modification of the original research program of ether mechanics, abandoning 
the strict requirement of a mechanical analysis of the process of propagation of 
electromagnetic effects, but still requiring continuous changes in an interven- 
ing medium, governed by local laws expressed in partial differential equations. 
(Doran [1975] contends that this late form of the ether program is essentially 
equivalent to field theory in the modern sense.) The field theory fits this later 
19th century research program as well as it did the earlier program of simple 
ether mechanics. As before, however, this consideration carries no evidential 
weight unless the program itself is supported by direct evidence of some kind. 
In this case the only new evidence is that relating to the conservation of energy. 
This is the topic of the next section. 


7 CONSERVATION OF ENERGY AND MOMENTUM 
During the 19th century Helmholtz (one of the co-discoverers of the 
conservation of energy) argued consistently and repeatedly against all forms of 
electromagnetic theory which could not be shown to satisfy conservation of 
energy and momentum. In particular, he used this type of argument to reject 
any form of dependence of electromagnetic action upon the velocity of the 
interacting particles, or upon the elapsed time, thus ruling out retarded distant 
action theories. This line of argument also occurs today; for example it is 
sometimes stated that the elimination of absolute simultaneity by special 
relativity makes distant action impossible because fields are then necessary in 
order to ensure energy and momentum conservation. 

In evaluating the evidential force of such arguments it is necessary in the 
first place to consider the empirical evidence upon which the conservation 
principles themselves are supposed to rest. In his famous 1847 paper on 
conservation Helmholtz takes as his starting point two principles, which he 
asserts to be equivalent. These are (a) the impossibility of perpetual motion, 
such as an isolated system of material particles which continually accelerates 
itself, gaining kinetic energy forever; and (b) that all forces between particles 
act instantaneously and along the line joining them, and are functions only of 
the distance between them (in modern terms, that the forces may be derived 
from a central potential). Evidently (a) has some empirical support, since we 
never observe the continual creation of energy from nothing, and many 
attempts to construct perpetual motion machines have been unsuccessful. The 
empirical character of (a) was stressed by Mach in his discussion of Helmholtz 
in 1871 (Hirosige [1976], pp. 61-4), and was emphasized by Helmholtz 
himself in the notes added to the 1881 reprint of his article (Kahl translation, 
pp. 53-4). In contrast, (b) has no direct empirical support other than its truth in 
the case of Newtonian gravitational theory. Newton had argued from (a) to the 
law of action and reaction (assuming instantaneous propagation of forces); 
Helmholtz, in effect, attempts to extend Newton's line of argument by deriving 
additional constraints on the possible force laws from the same empirical 
proposition (a). 

The remainder of his paper develops the consequences of (b) in several 
branches of physical theory. The consequences include various instances of 
the conservation of energy through transformation among different forms, 
such as we today take as the empirical basis for the conservation principle. In 
Helmholtz's paper, however, these appear more as conjectural predictions 
from (b) than as hypothetico-deductive support for it; the main empirical basis 
for (b) itself seems to be its derivation from (a). 

Helmholtz seems to have been firmly convinced of the truth of (b), even in 
domains such as electromagnetic theory for which its correctness had little 
direct support. While remaining for many years an advocate of the distant 
action approach, Helmholtz conducted a sustained polemic against all forms of 
distant action theory which did not meet condition (b), either because of 
retardation as with the theory of C. Neumann, or because the distant forces 
were dependent upon the velocities as well as the positions of the particles, as 
with the theories of Weber (Harman [1982] pp. 104-6, Woodruff [1968], 
D'Agostino [1975], p. 265).¹¹ It appears that his arguments relied heavily 
upon attempts to show that these theories violated (a). This too suggests that 
Helmholtz saw the primary support for (b) as being its derivability from (a). 

In fact, (b) does not follow from (a). I do not know of a detailed analysis of the 
mathematical arguments in Helmholtz's paper and of the degree to which they 
actually justify his claims; Laue [1950] (pp. 86-7) mentions Helmholtz's error 
on this point but gives no discussion. Helmholtz’s 1881 appendices also 
mention some gaps in the 1847 derivation. Many authors (e.g. Dugas [1955], 
pp. 434-5) charitably interpret Helmholtz as having assumed (b), without 
mentioning the purported derivation of (b) from (a). From a retrospective 
viewpoint this is understandable, since he does mention (b) as an alternative to 
(a) as a possible basis for the developments of the 1847 paper, and (b) is closer 
to what we now think of as the conservation of energy than (a) is. But within 
the historical context this error may have played a substantial role in the 
acceptance of (b), for Helmholtz himself and those who were influenced by 
him. 


The falsity of Helmholtz’s claim may be shown using the empirical 
equivalence of the retarded distant action and the field theories of electromag- 
netism. Suppose with Helmholtz that in consequence of the violation of (b) by 
the retarded distant action theory there will be some arrangement of 
interacting charges connected by rigid rods which will exhibit a continual 
acceleration, in violation of (a). Since the laws of particle motion are the same 
for the field theory, the same motion will be predicted by the field theory for this 
system. But a continual acceleration in the sense intended by Helmholtz would 
involve an unbounded increase in the kinetic energy of the material part of the 
system alone, which by the conservation theorem of the field theory is 
impossible for an isolated system. Therefore this is also impossible on the 
retarded distant action theory. Helmholtz’s direct empirical argument against 
the possibility of retardation on grounds of violation of (a) is therefore 
unsound, and provides no evidential basis for his rejection of retarded distant 
action. 

11 Today we tend to equate distant action theory with an ontology of particles and forces alone 
and field theory with an ontology of particles and states of empty space (fields or ether) alone. 
Helmholtz fails to fit this modern dichotomy, since he accepted both ether and distant action: 
like Faraday and Maxwell he postulated physical states of polarisation of space, but unlike them 
he also postulated instantaneous distant action of charged particles upon one another and 
upon distant portions of ether, rather than relying (like Maxwell) entirely upon local action 
propagated continuously through the ether. In modern language Helmholtz postulates not a 
single ‘field of force’ but rather both fields and forces as independent entities, and moreover 
allows forces to act on fields, something nearly incomprehensible from a modern viewpoint. 
The relations among these various types of theory are described in part B of the Introduction to 
Hertz [1892]. 

The principle (a) is weaker than our modern conservation laws, since not all 
violations of conservation lead to the possibility of perpetual motion. The 
modern laws (in their global form) simply assert that the total energy and 
momentum of an isolated system remain constant. A process which violates 
this law need not lead to the possibility of perpetual motion, since in the first 
place the violation may be a loss rather than a gain of total energy or 
momentum, and in the second place the process may not be repeatable at will. 
The conservation laws in their modern form do of course have empirical 
support, based on direct measurements of total energy or momentum before 
and after various kinds of physical transformation. We must therefore also 
consider a second and more modern form of evidential argument against 
retarded distant action theories, based upon their violation of conservation in 
the modern sense as supported by observations of this kind, without reference 
to the principle (a). 

The distant action theory violates the conservation laws in connection with 
radiative processes. Any radiating system not surrounded by a perfect absorber 
will continually lose energy by radiation damping, as described in Section 6. A 
radiant source rigidly attached to a parabolic mirror will accelerate itself in the 
direction of the mirror, due to the spatial asymmetry of the radiation damping 
on the charges in the reflecting surface, violating momentum conservation. 
Neither of these processes leads to perpetual motion: in both cases there is a loss 
rather than a gain of energy, and hence the process is not repeatable at will, 
since the available internal energy of the radiating source will eventually be 
exhausted. 

It is important to note that the distant action theory violates conservation 
only in connection with radiation. This follows from the conservation theorem 
of the field theory, since in any process not involving significant radiative loss 
the field theory will predict conservation for the mechanical energy and 
momentum of the material particles alone, and these quantities are the same 
for the distant action theory. 

This is important for two reasons. First, the empirical basis for the 
conservation laws consists of cases in which there is no appreciable radiative 
loss, since only the energy or momentum of material particles can be observed. 
Thus both theories account equally well for this evidence. Second, the linkage 
of violation of conservation to radiative processes allows the distant action 
theory to explain why conservation holds in some cases but not in others. The 
sole process which violates conservation is radiative loss, i.e. acceleration of 
charged particles not completely surrounded by perfectly absorbing matter. In 
these cases the violation may be directly observed. 

The difference between the two theories thus lies in their respective 
responses to the combination of observed conservation in non-radiative 
processes together with observed violation of conservation in radiative 
processes. The distant action theory accepts at face value the empirical fact of 
energy-momentum loss from accelerated charges, as represented by the 
radiation damping term in the equation of charged particle motion, whereas 
the field theory postulates a new energy-bearing entity in order to preserve in 
radiative cases the conservation law which holds in the other cases. Our 
question is whether this provides a sufficient evidential basis for acceptance of 
the existence of the field. 

The situation may usefully be compared to that which arose in the case of 
beta decay (see Pais [1986], pp. 105-15, 303-20). Confronted with the 
empirical evidence of violation of the conservation laws in beta decay, one 
response was to suppose that the laws break down for this particular process 
(as for example it is today believed that a number of conservation laws valid for 
other types of process do indeed break down for beta decay and other weak 
interaction processes). A second response, due to Pauli, was to propose that a 
new hypothetical entity, the neutrino, takes part in the interaction and carries 
off the missing energy and momentum. 

The analogy is incomplete. First, the neutrino is at least the same kind of 
thing as other entities known to exist, even if it has some unusual properties. 
The field, on the other hand, is an entity totally unlike anything else known to 
exist. (In this respect it resembles space-time itself, whose physical existence 
may be questioned on similar grounds.) Second, at the time when the neutrino 
was postulated (1930) there was definite empirical evidence (the Compton 
effect, the recoil experiments of Bothe and Geiger; cf. Mehra and Rechenberg 
[1982], pp. 607-13) for energy and momentum conservation in some 
individual atomic processes. On the other hand there is no direct evidence at a 
classical level for conservation in radiative processes involving single particles. 
For these two reasons the analogy with conservation in other types of process 
provided better grounds for the postulation of the neutrino than it does for the 
postulation of the classical electromagnetic field. 

However, even in the case of the neutrino it was clearly recognized that the 
argument from conservation alone does not constitute sufficient grounds for 
accepting the existence of the postulated particle (see Pais [1986], pp. 309-21). 
It is necessary to find additional empirical evidence which supports the new 
theory. The comparatively slight formal or aesthetic differences between the 
competing theories do not provide adequate scientific grounds for accepting 
one and rejecting the other as long as no empirical evidence distinguishes 
between them. However, there is no such evidence in the electromagnetic 
case. The field theory is supported only by an analogical argument from 
conservation in other processes, an argument which is even weaker than that 
leading to the neutrino hypothesis. The analogical argument at most provides 
grounds for suspecting that there may be some additional component of the 
interaction process not represented in the retarded distant action theory. It 
does not justify the conclusion that such an additional component definitely 
exists and is in fact the field as described in the classical field theory. (Of course 
the suspected intermediary does indeed exist, but it is the photon, not the 
classical field.) 

To conclude this section, it appears that neither Helmholtz’s argument from 
the impossibility of perpetual motion nor the analogical argument from 
conservation in non-radiative processes provides sufficient empirical justification 
for accepting the existence of classical fields and rejecting the retarded 
distant action theory.¹² 


8 CONCLUSION 


This completes our survey of the empirical considerations normally thought to 
bear in favor of fields over distant action. The conclusion is that none of these 
considerations possess significant evidential force, either from a contemporary 
viewpoint or within the 19th century context. In brief, neither now nor in the 
19th century has the empirical evidence warranted acceptance of the physical 
existence of the classical electromagnetic field. 

There is still the historical problem of providing an alternative explanation 
for the triumph of field over distant action theory on the Continent. I would 
tentatively propose the following. We know that the theory of conservation of 
energy had a great impact in the later part of the century, and was in fact 
accepted as providing constraints applicable to electromagnetic theories in the 
manner advocated by Helmholtz (Wise [1981], pp. 301-3). What requires to 
be changed is only our interpretation of the nature of this influence. Broadly 
speaking, I would suggest that the idea of energy conservation should be seen 
as leading to a new research program on the Continent, which when combined 
with the existing program of distant action research in electromagnetic theory 
led to a serious reduction in the flexibility of the latter. The two programs were 
not inconsistent, as shown by Helmholtz’s spirited pursuit of both for many 
years, but when taken together they eliminated the one type of distant action 
theory which could have accounted for the effects later observed by Hertz. On 
this account the essential factor in the elimination of distant action theory was 
not the empirical superiority of Maxwell’s field theory, as in the standard 
picture, but rather the non-empirical triumph of one research program over 
another on the Continent. Hertz’s demonstration of electromagnetic retardation 
made it impossible to adhere any longer both to the new conservationist 
program and to the older distant action program, but did not indicate which 
should be abandoned. To understand the triumph of field theory on the 
Continent we must understand the non-empirical factors which, at the time of 
Hertz's discoveries, made it seem more important to adhere to the conservationist 
program than to the distant action program. Discussion of such factors is 
outside the scope of this paper. 

12 Regarding conservation it should also be noted, as pointed out by my eminent colleague Fritz 
Rohrlich, that the deductive apparatus (centered on Noether's theorem) connecting conservation 
laws with variational principles will also cease to apply for a retarded distant action theory. 
The ultimate reason for this is simply that the laws of particle motion of such a theory will not 
be derivable in the standard ways from a Lagrangian or Hamiltonian defined on the particle 
coordinates alone. I do not regard this as a scientifically significant argument against the 
theory, since we have no more direct empirical evidence that the true laws of nature must be 
derivable from a variational principle than we do that all quantities must obey a conservation 
law. Both statements hold true for non-radiative processes, are empirically violated in processes 
involving radiative loss, and may be rescued in an ad hoc fashion by postulating additional 
unobservable degrees of freedom for the system (i.e. the field). Doing so yields a moderate formal 
simplification (applicability of variational principle, uniform derivability of strict conservation 
laws for all quantities using Noether's theorem), at the cost of postulating a completely new 
type of unobservable object, while yielding no new testable predictions concerning the 
observable quantities. Such an extension is surely no better supported than the original theory. 


Department of Philosophy 
Syracuse University 


REFERENCES 


ANDERSON, J. L. [1967]: Principles of Relativity Physics. New York, Academic Press. 

BARUT, A. O. [1964]: Electrodynamics and Classical Theory of Fields and Particles, 
Macmillan. Reprinted by Dover, New York, 1980. 

CANTOR, G. and HODGE, M. (eds.) [1981]: Conceptions of Ether, Cambridge. 

D'AGOSTINO, S. [1975]: ‘Hertz’s Researches on Electromagnetic Waves’, Historical 
Studies in the Physical Sciences, 6, 261-323. 

DAVIES, P. C. W. [1974]: The Physics of Time Asymmetry. Berkeley and Los Angeles, 
University of California Press. 

DORAN, B. G. [1975]: ‘Origins and Consolidation of Field Theory in Nineteenth Century 
Britain’, Historical Studies in the Physical Sciences, 6, 133-260. 

DUGAS, R. [1955]: A History of Mechanics, Editions du Griffon. Switzerland, Neuchatel. 

EINSTEIN, A. [1905]: ‘On the Electrodynamics of Moving Bodies’, English translation in 
Einstein et al., The Principle of Relativity, Methuen, 1923 and Dover reprint. 

FIELD, H. [1980]: Science Without Numbers. Princeton, Princeton University Press. 

FIELD, H. [1985]: ‘Can We Dispense with Space-Time?’, in Asquith, P. D. and Kitcher, P. 
(eds.), PSA 1984, vol. 2. East Lansing, Michigan, Philosophy of Science 
Association, pp. 33-90. 

FRIEDMAN, M. [1983]: Foundations of Space-Time Theories. Princeton, N.J., Princeton 
University Press. 

GAL-OR, B. [1981]: Cosmology, Physics and Philosophy, Springer-Verlag. 

GEROCH, R. [1978]: General Relativity from A to B, Univ. of Chicago. 

GRÜNBAUM, A. [1976]: ‘Is Preacceleration in Dirac's Electrodynamics a Case of 
Backwards Causation?’, Philosophy of Science, 43, 165-201. 

GRÜNBAUM, A. and JANIS, A. [1977]: ‘Is there Backward Causation in Classical 
Electrodynamics?’, Journal of Philosophy, 74, 475-82. 

HARMAN, P. M. [1982]: Energy, Force and Matter, Cambridge. 

HELMHOLTZ, H. [1847]: ‘The Conservation of Force: a Physical Memoir’, English 
translation in R. Kahl (ed.), Selected Writings of Hermann von Helmholtz. Middletown, 
Conn., Wesleyan Univ. Press, 1971, 3-55. 

HERTZ, H. [1892]: Untersuchungen ueber die Ausbreitung der Elektrischen Kraft, Johann 
Ambrosius Barth, Leipzig. English translation by D. E. Jones as Electric Waves, 
Macmillan and Co., 1893, reprinted by Dover, 1962. 

HESSE, M. [1955]: ‘Action at a Distance in Classical Physics’, Isis, 46, 337-53. 

HESSE, M. [1961]: Forces and Fields, reprinted by Littlefield, Adams and Co., Totowa, 
New Jersey, 1965. 

HIROSIGE, T. [1976]: ‘The Ether Problem, the Mechanistic Worldview, and the Origins 
of the Theory of Relativity’, Historical Studies in the Physical Sciences, 7, 3-82. 

LAUE, M. [1950]: History of Physics. New York, Academic Press. 

LORENTZ, H. A. [1909]: The Theory of Electrons, second edition 1916, reprinted by Dover, 
New York, 1952. 

MACH, E. [1883]: The Science of Mechanics, seventh German edition 1912, English 
translation by T. J. McCormack, Open Court, La Salle, Ill., 1960. 

MEHRA, J. and RECHENBERG, H. [1982]: The Historical Development of Quantum Theory, 
vol. 1, part 2, Springer-Verlag. 

MILLER, A. I. [1981]: Albert Einstein's Special Theory of Relativity. Reading, Massachusetts, 
Addison-Wesley. 

MUNDY, B. [1986]: ‘The Physical Content of Minkowski Geometry’, British Journal for 
the Philosophy of Science, 37, 25-54. 

NAGEL, E. [1961]: The Structure of Science, reprinted by Hackett, Indianapolis, 1979. 

PAIS, A. [1986]: Inward Bound, Oxford. 

ROHRLICH, F. [1965]: Classical Charged Particles. Reading, Massachusetts, Addison- 
Wesley. 

ROSENFELD, L. [1951]: Theory of Electrons, reprinted by Dover, New York, 1965. 

SALMON, W. [1984]: Scientific Explanation and the Causal Structure of the World. 
Princeton, Princeton Univ. Press. 

SCHAFFNER, K. (ed.) [1972]: Nineteenth-Century Ether Theories. Oxford, Pergamon. 

SIEGEL, D. M. [1981]: ‘Thomson, Maxwell, and the Universal Ether in Victorian 
Physics’, in Cantor and Hodge (eds.) 1981, 239-68. 

SKLAR, L. [1974]: Space, Time and Space-Time, second edition 1976. Berkeley, University 
of California Press. 

STEIN, H. [1970]: ‘On the Notion of Field in Newton, Maxwell, and Beyond’, in R. 
Stuewer (ed.), Historical and Philosophical Perspectives of Science. Minneapolis, Univ. 
of Minnesota Press, 264-310. 

TELLER, P. [forthcoming]: ‘Space-Time as a Physical Quantity’, to appear in R. Kargon 
and P. Achinstein (eds.), Theoretical Physics in the 100 Years Since Kelvin's Baltimore 
Lectures. Cambridge, Mass., M.I.T. Press. 

WHEELER, J. and FEYNMAN, R. [1945]: ‘Interaction with the Absorber as the Mechanism 
of Radiation’, Reviews of Modern Physics, 17, 157-81. 

WHEELER, J. and FEYNMAN, R. [1949]: ‘Classical Electrodynamics in Terms of Direct 
Interparticle Action’, Reviews of Modern Physics, 21, 425-33. 

WHITEHEAD, A. N. [1906]: ‘On Mathematical Concepts of the Material World’, 
Philosophical Transactions of the Royal Society. Reprinted in Northrop, F. S. C. and 
Gross, M. (eds.), Alfred North Whitehead, An Anthology. New York, Macmillan, 
1961, 7-82. 

WHITTAKER, E. [1951]: A History of the Theories of Aether and Electricity, vol. 1: The 
Classical Theories, reprinted by Harper, New York, 1960. 

WISE, M. N. [1981]: ‘German Concepts of Force, Energy, and the Electromagnetic Ether: 
1845-1880’, in Cantor and Hodge (eds.) 1981, 269-307. 

WOODRUFF, A. E. [1962]: ‘Action at a Distance in Nineteenth Century Electrodynamics’, 
Isis, 53, 438-59. 

WOODRUFF, A. E. [1968]: ‘The Contributions of Hermann von Helmholtz to Electrodynamics’, 
Isis, 59, 300-11. 


Brit. J. Phil. Sci. 40 (1989), 69-76 Printed in Great Britain 


Simultaneity, Conventionality and 
Existence* 


VESSELIN PETKOV 


ABSTRACT 


The present paper pursues two aims. The first is to show that the experiment proposed by 
Stolakis [1986] does not lead to absolute synchronization in a single frame of 
reference, and therefore does not lead to a measurement of the one-way velocity of light either. 
The second is to show, by considering in turn the problems of the conventionality of 
simultaneity and of existence, that the simultaneity of distant events can be 
a matter of convention only in a four-dimensional world. 


1 Introduction 
2 Has the Conventionality of Simultaneity Been Refuted? 
3 On the Essence of the Conventionality of Simultaneity 


1 INTRODUCTION 


There have been three changes in the history of the idea of simultaneity. It 
changed for the first time in the 17th century, after Rømer had shown that light 
propagates with a finite, though extremely high, velocity. It was then clearly realized that the events 
we observe simultaneously at a given moment of time have in actual fact taken 
place at different previous moments. The idea of simultaneity changed for 
the second time when the theory of relativity showed that simultaneity is 
not absolute, that it is meaningful to speak of the simultaneity of 
events only with respect to a given reference frame (or observer). The third change 
of view on simultaneity is connected with the elucidation of the fact that even 
with respect to a single reference frame the definition of the simultaneity of events 
is not absolute but is a matter of convention. The present paper is devoted 
precisely to elucidating the essence of the conventionality of simultaneity. 
The problem of the conventionality of simultaneity stems from 
Einstein and Reichenbach and chiefly boils down to the following. Two distant 
clocks at the respective points A and B in a single frame of reference must be 
synchronized. At moment $t_1^A$ a light signal is emitted from point A, is reflected 
at point B at moment $t_2^B$ and returns to point A at moment $t_3^A$. The times $t_1^A$ and 
$t_3^A$ are measured by the clock at point A, while the time $t_2^B$ is measured at point 
B. The question is: is there an objective criterion on the basis of which it can be 
determined which moment $t_2^A$ of the interval $(t_1^A, t_3^A)$ is simultaneous with the 
moment of reflection of the light signal at point B, i.e. with the moment $t_2^B$? The 
different possibilities are expressed in the formula: 

$$t_2^A = t_1^A + \varepsilon\,(t_3^A - t_1^A),$$ 

in which $0 < \varepsilon < 1$. However, it appears that in the attempt to establish this 
moment, i.e. in the attempt to determine the ‘real’ value of $\varepsilon$, a logical circle is 
obtained (Reichenbach [1958]): if we try to synchronize the clocks at points A 
and B by light signals we must know the one-way velocity of light; but in order 
to measure it the clocks should have been synchronized beforehand. The 
situation in the attempt to synchronize the clocks by slow transport of a third 
clock is analogous. 

* I am grateful to the anonymous referees for their useful suggestions on an earlier version of this 
paper. 

This situation shows that it is not possible to determine the value of $\varepsilon$ by 
means of a physical experiment, i.e. it is impossible to establish which events 
are simultaneous at a given moment of time even as regards a single system of 
reference. From this the conclusion is drawn that determining simultaneous 
events is a matter of convention, from which it follows in particular that the 
answer to the question of whether the back and forth velocities of light are one and the 
same also proves to be conventional. The latter conclusion appears to arouse 
the strongest instinctive resistance, because it contradicts the seemingly obvious 
statement that in reality light travels at an exactly determined 
velocity. If this is so, we should be able to measure its one-way velocity. This 
dissatisfaction with the conventionality of simultaneity probably explains why 
ideas for measuring the one-way velocity of light are persistently suggested. 
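
The dependence of the one-way velocities on the synchronization parameter can be illustrated with a small numerical sketch. It assumes only the synchronization rule stated above, namely that the moment on clock A deemed simultaneous with the reflection lies a fraction ε of the way through the round-trip interval; the function names, the separation of 1000 m and the sample values of ε are hypothetical choices made for illustration and do not come from the paper.

```python
C = 299_792_458.0  # measured round-trip (two-way) speed of light, m/s


def reflection_time(t1, t3, eps):
    """Moment on clock A deemed simultaneous with the reflection at B.

    t1 and t3 are the emission and return times read on clock A;
    eps is the synchronization parameter, restricted to 0 < eps < 1.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return t1 + eps * (t3 - t1)


def one_way_speeds(distance, t1, t3, eps):
    """Outbound (A to B) and return (B to A) speeds implied by eps."""
    t2 = reflection_time(t1, t3, eps)
    return distance / (t2 - t1), distance / (t3 - t2)


if __name__ == "__main__":
    d = 1000.0                 # hypothetical separation of A and B, metres
    t1, t3 = 0.0, 2.0 * d / C  # emission and return times on clock A
    for eps in (0.25, 0.5, 0.75):
        out, back = one_way_speeds(d, t1, t3, eps)
        round_trip = 2.0 * d / (t3 - t1)
        print(f"eps={eps}: out={out:.4e} back={back:.4e} round trip={round_trip:.4e}")
```

Under these assumptions the choice ε = 1/2 gives equal outbound and return speeds, while any other admissible value gives unequal one-way speeds and leaves the round-trip speed, the only quantity read off a single clock, unchanged.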


2 HAS THE CONVENTIONALITY OF SIMULTANEITY 
BEEN REFUTED? 


One of the most recent suggestions for an experimental solution of the problem 
of the conventionality of simultaneity was made by Stolakis [1986]. The aim 
pursued differs from the result actually achieved. The author's efforts were directed 
to proposing an experimental way of measuring the one-way velocity of light, thus 
showing the inconsistency of the conventionality thesis. Actually, however, he 
proposes an experiment which has no relation to the conventionality of 
simultaneity but could serve to discover a possible anisotropy of space-time in a 
space-like direction. 

The essence of the experiment proposed by Stolakis is revealed by a 
more thorough analysis of the standard scheme for synchronizing the clocks A 
and B, distant in space, set forth in Part 1. Figure 1-a shows the world lines A 
and B of the two clocks (at rest with respect to one another) and the world line of the 
light ray propagating between them. In standard synchronization ($\varepsilon = 1/2$) 
event R (the reflection of the light ray from clock B) is simultaneous with event M. In 
this case space is orthogonal to the world lines of the clocks and the velocity of 
the back and forth light signal is one and the same. For any other choice of $\varepsilon$ (within 
the limits $0 < \varepsilon < 1$) space is not orthogonal to the world lines of clocks A and B 
and the velocities of the back and forth light rays are different. Therefore the velocity of 
light in one direction is different from the velocity in the opposite direction, 
owing to the fact that space is not orthogonal to A and B. This conclusion is 
only correct, however, if a premise (usually tacitly understood) is fulfilled, namely 
that space-time is isotropic. In this case (depicted in Figure 1-a) the slant (in 
relation to A and B) of the world line of the light ray propagating from clock A 
to B is equal to the slant of the world line of the light ray from B to A, i.e. the 
angle $\alpha$ is equal to the angle $\beta$. But if space-time is anisotropic in a space-like 
direction (see Figure 1-b) the slant (in relation to A and B) of the world line of 
the light ray propagating from clock A to clock B can be larger than the slant 
of the world line of the light ray propagating from B to A, i.e. the angle $\alpha$ can 
be larger than the angle $\beta$ (this anisotropy of space-time would appear as a 
deformation of the light cone). In this case the orthogonality of space in relation to 
A and B does not ensure equality of the light velocity in the back and forth directions. 
Therefore there is no simple connection between the equality of the light velocity 
in the back and forth directions and the orthogonality of space in relation to the 
world lines of the clocks, i.e. in relation to the time axis of the system of 
reference in relation to which clocks A and B are at rest. 

[Figure 1, panels (a) and (b): world lines of clocks A and B and of the light ray propagating between them, in isotropic and anisotropic space-time respectively.] 

In both cases depicted in Figures 1-a and 1-b, we can arbitrarily choose the 
value of e (of course in the limits of 0 «&« 1), ie. we can arbitrarily define 
which moment of the interval (t$, t$) is simultaneous with t}. Thus, in both 
cases we would obtain arbitrary values of back and forth velocity of light (as 
Winnie [1970] showed and all relative velocities will also prove to be 
arbitrary). In other words, we can arbitrarily (but nevertheless within definite 
limits, ensuing from the restriction 0 «s « 1) choose the angle between space 
and the world lines A and B. This arbitrariness shows that defining the 
simultaneity of events M and R is a matter of convention and ensues from the 
fact that not a single moment in the interval (t4, t4) is objectively priviledged so 
that it can be considered simultaneous with the moment t8. The matter of the 
essence of this objective privilege will be discussed in Part 3. But even if we 
accept that any moment of the interval (t^, t4) is simultaneous with the 
moment t}, the question of the orthogony of space in relation to the world lines 
A and B (Le. the matter of the isotropy of space-time in a space-like direction) 
remains open. Let us suppose that event M is simultaneous with event R, i.e. 
with the moment t3. Then in both cases depicted on Figures 1-a and 1-b it 
appears that back and forth velocity of light is one and the same, but owing to 
the isotropy of space-time, on Figure 1-a space is orthogonal to the world lines 
A and B, while, owing to the anisotropy of space-time in a space-like direction, 
on Figure l-b space is not orthogonal to A and B. These two cases are 
indiscernible experimentally. The matter of the isotropy of space-time cannot 
be solved by a more complicated experiment (shown on Figure 1-c) in which 
light signals are emitted from a clock in opposite directions to two other equally 
distant clocks. After being reflected from the endmost clocks the light signals 
return to the central clock. Regardless of whether the light signals propagate in 
isotropic space-time (in this case their world lines are depicted with dotted lines)
or in anisotropic space-time, they arrive simultaneously at the central clock. 
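
The arbitrariness described above can be written out in Reichenbach's ε-notation. What follows is only a sketch under stated assumptions: the symbols L (the distance between clocks A and B), c (the measured round-trip speed of light) and c_+ and c_- (the outbound and return one-way speeds) are introduced here for illustration and are not the author's notation; t_1^A and t_3^A are taken to label emission and return at A, and t_2^B the reflection at B.

```latex
% Reichenbach's epsilon-definition of simultaneity (standard form, assumed here)
t_2^B = t_1^A + \varepsilon\,(t_3^A - t_1^A), \qquad 0 < \varepsilon < 1 .

% One-way speeds implied by a given choice of epsilon,
% using the measured round-trip time t_3^A - t_1^A = 2L/c
c_+ = \frac{L}{t_2^B - t_1^A} = \frac{c}{2\varepsilon}, \qquad
c_- = \frac{L}{t_3^A - t_2^B} = \frac{c}{2(1-\varepsilon)} .

% The round-trip time is the same for every admissible epsilon
\frac{L}{c_+} + \frac{L}{c_-}
  = \frac{2\varepsilon L}{c} + \frac{2(1-\varepsilon)L}{c}
  = \frac{2L}{c} .
```

Only the choice ε = 1/2 gives c_+ = c_- = c; every other admissible value yields unequal one-way speeds while leaving the observable round-trip time untouched.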

Nevertheless, the question might be asked: is there any way of distinguishing the
cases of isotropic and anisotropic space-time experimentally? The value of the
experiment proposed by Stolakis lies precisely in the fact that it attempts to
answer this question. The experiment is depicted in Figure 2. Figure 2-a shows
the case when space-time is isotropic, while on Figure 2-b we see the case of 
anisotropic space-time. The smaller slant of the world lines of the light signals 
to events M and N (lying on the world lines of the points at which the light rays 
have been reflected) in Figure 2-a, compared with the slant of their world lines 
after M and N, reflects the fact that up to the moment when the light signals are 
reflected they have moved at a lesser velocity (owing to their propagation in a 
medium with a refractive index of n > 1). In this case, owing to the isotropy of
space-time, the light signals arrive simultaneously with event C4. However, this 
is no proof that back and forth velocity of light is one and the same, because we
can again choose to have space not orthogonal to the time axis which will lead
to different values of back and forth velocity. We have no criterion (and the
Stolakis experiment does not give us one) which would force us to choose space
in such a way as to have events M and N occur simultaneously in it. Only if we
consider events M and N as simultaneous (i.e. as being privileged in comparison
with the remaining events on the world lines on which events M and N lie)
could it be asserted that the one-way velocity of light has been found. The
simultaneous arrival of light rays at the initial point from which they were
emitted is proof only of the fact that space-time is isotropic in a space-like
direction.

[Figure 2, panels (a) and (b): space-time diagrams of the Stolakis experiment, with the reflection events M and N marked.]

Figure 2-b depicts the Stolakis experiment in a case of anisotropic
space-time in a space-like direction. In this case, the light signals emitted at event C_1
and reflected at events M and N respectively will not arrive simultaneously at
their point of emission. The reason for the non-simultaneous arrival of the light
signals (i.e. the appearance of the time difference tc,c,) is, however, due solely 
to the anisotropy of space-time. Even now the definition of back and forth
velocity of light is a matter of convention, because once again the criterion for 
the simultaneity of events M and N has not been discovered. The only fact 
which can be established by the Stolakis experiment is whether space-time is 
anisotropic in a space-like direction. 


Yet even if the problem of the isotropy of space-time is solved, the problem of
the conventionality of simultaneity remains untouched. Thus the Stolakis
experiment has no relation to the problem of the conventionality of
simultaneity. The analysis of this experiment shows once again the impossibility
in principle of measuring one-way velocity of light and therefore of rejecting
the thesis of the conventionality of simultaneity.


3 ON THE ESSENCE OF THE CONVENTIONALITY OF SIMULTANEITY 


The discussion on the problem of the conventionality of simultaneity would 
hardly have continued so long if it had been considered in close connection 
with the problem of existence. The joint consideration of both problems makes 
it possible (1) categorically to establish that the definition of simultaneous 
events is a matter of convention, and (2) to elucidate the essence of this 
convention. So that this joint consideration may be realized, from now
onwards ‘simultaneously at the present moment of time’ will be implicit in the use
of the term ‘simultaneity’.

According to the classical (pre-relativistic) view of reality only the present
exists, i.e. only the constantly changing three-dimensional world at the
moment 'now'. But bearing in mind the fact that the present (the
three-dimensional world) is a set of simultaneous events at the present moment (i.e.
the set of material objects existing simultaneously at the moment ‘now’) it
follows that the simultaneity of events is an objective fact.

Therefore, the objectivity of simultaneity is expressed in the circumstance 
that simultaneous events occur simultaneously, i.e. that the objects with which 
these events take place exist simultaneously. It is precisely the existence at the 
moment 'now' which objectively privileges one class of events. As is apparent, a
conventionality of simultaneity in this case leads to an unacceptable
conclusion regarding a conventionality also connected with what exists (R.
Weingard [1972] says that if two events are real, their simultaneity cannot be 
a matter of convention). Therefore, according to the classical view of reality, 
simultaneous events at the present moment of time (i.e. present events) are 
objectively privileged in comparison with past and future events (since only 
present events are considered as existing), which means that simultaneity is 
not conventional. 

However, things look quite different from the point of view of the theory of 
relativity. The problem is additionally complicated by the fact that the problem 
of the dimensionality of the world has not been convincingly solved to this day. 
If we allow that even according to the special theory of relativity reality is a 
three-dimensional world (a three-dimensional space-like slice of the Min- 
kowski world—the present), the above-mentioned unacceptable conventiona- 
lity as regards what exists follows from the conventionality of simultaneity. 
Therefore reality cannot be a three-dimensional world if the definition of
simultaneity is a matter of convention, because if it could, it follows that it will 
depend on our own will which three-dimensional space-like slice of the 
Minkowski world we are to consider as reality. This unacceptable conclusion is 
reached on the basis of the premises of the conventionality of simultaneity and 
the three-dimensionality of reality. It can only be avoided if we give up one of 
the premises. The impossibility of demonstrating by means of a physical 
experiment the privileged state (existence) of only one set of events which are 
to be considered as simultaneous at the present moment (i.e. the impossibility of 
refuting the conventionality thesis of simultaneity), shows that the premise of 
the three-dimensionality of the world should be abandoned. This conclusion 
also follows directly from the relativity of simultaneity. The fact that the 
observers in relative motion have different classes of simultaneous events 
shows that not a single class is objectively privileged. Therefore all events are
equally real (Putnam [1967]), which means that it really is a question of 
convention which events should be considered as simultaneous for a given 
observer. In the contrary case—if any observer has objective grounds to 
examine as simultaneous exactly defined events—this would mean that, being 
objectively privileged, these events would be simultaneous for all observers in 
relative motion. In such a case, however, the relativity of simultaneity proves 
to be impossible. That is why it (the relativity of simultaneity) unambiguously 
shows that the simultaneity of distant events is conventional (from which it
follows that the world cannot be three-dimensional). If we deny the view that 
the world is three-dimensional and consider the Minkowski world as a 
mathematical model of a real four-dimensional world, conventionality in the 
choice of a three-dimensional space-like slice of this world becomes trivial, 
because all slices are equally existent and the matter only concerns the 
convenience of which slice is to be examined as the present. In other words, 
since all events in a four-dimensional world exist in the same way the 
conventionality of simultaneity does not also lead to conventionality of 
existence, but is connected only with the description of the four-dimensional 
world in our habitual ‘three-dimensional language’. The elucidation of this 
situation reveals the profound essence of the conventionality of simultaneity:
we cannot choose one class of events among the events of the Minkowski
world simply because there is no such class of objectively privileged events
(owing to the equal existence of all events), which we might examine as
simultaneous at the moment ‘now’.

It can now be said that the logical circle obtained in an attempt to imagine 
an experiment enabling us to define which events are simultaneous at a given 
moment of time, or to establish the one-way velocity of light, convincingly 
shows that we have tried, on the basis of an erroneous view of the
dimensionality of the world, to discover the objective content of concepts (such as
simultaneity and velocity) which, it appears, have no such content
according to a more adequate view of reality. This explains why the
conventionality of simultaneity does not presuppose some kind of ‘agreements’ 
concerning physical magnitudes as to which we have been intuitively 
convinced that they have an exactly defined objective content. And, indeed, the 
concepts simultaneity and velocity have no adequate content, because among 
the equally existing events of the four-dimensional world there is no set of events 
which are privileged as being simultaneous (at the present moment). We can 
speak of simultaneity and velocity only after the stratification of the four- 
dimensional world, i.e. of space-time, into space and time. However, this 
stratification has no objective basis (space-time is not really divided into space 
and time precisely because of the equal existence of all events), but is the result 
solely of a description of the four-dimensional world in 'three-dimensional 
language'. Owing to this reason it is obvious that it is indeed a matter of 
convention as to how to stratify space-time into space and time (of course, 
within the framework of the requirement 0 < ε < 1). Only in the case when we
have chosen that space should be orthogonal to the time axis (i.e. with ε = 1/2
and with isotropic space-time) is the back and forth velocity of light one and
the same. Depending on the manner in which we stratify space-time into
space and time (i.e. depending on the choice of ε) different values of back and
forth velocity of light will be obtained and the experiment will always confirm
(given the supposed isotropy of space-time) the theoretically foretold result, simply
because in processing the experimental data the premise on the choice of ε is
essentially made use of. That is why all attempts to measure one-way velocity 
of light are doomed to failure beforehand. 
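
The circularity described in this paragraph can be illustrated numerically. The sketch below is an illustration only and not a procedure taken from the paper: the function names, the units (c = 1) and the sample distance are all assumptions made here. It takes the only directly measured quantities, the emission and return times at one clock, assigns the reflection time by a chosen ε, and shows that the one-way speeds "measured" in this way are exactly the ones that the same ε predicts in advance.

```python
import math

C = 1.0  # assumed round-trip (two-way) speed of light, in units where c = 1


def reflection_time(t1, t3, eps):
    """Time assigned to the reflection event under a chosen epsilon."""
    if not 0.0 < eps < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return t1 + eps * (t3 - t1)


def inferred_one_way_speeds(distance, t1, t3, eps):
    """One-way speeds 'measured' from round-trip data plus a chosen epsilon."""
    t2 = reflection_time(t1, t3, eps)
    return distance / (t2 - t1), distance / (t3 - t2)


def predicted_one_way_speeds(eps, c=C):
    """One-way speeds that the same choice of epsilon predicts in advance."""
    return c / (2.0 * eps), c / (2.0 * (1.0 - eps))


if __name__ == "__main__":
    distance = 3.0                    # hypothetical clock separation
    t1, t3 = 0.0, 2.0 * distance / C  # the only directly measured quantities
    for eps in (0.25, 0.5, 0.8):
        inferred = inferred_one_way_speeds(distance, t1, t3, eps)
        predicted = predicted_one_way_speeds(eps)
        agree = all(math.isclose(a, b) for a, b in zip(inferred, predicted))
        print(eps, inferred, predicted, agree)
```

Because the chosen ε enters the data processing itself, agreement between the inferred and the predicted values carries no information about which ε is correct.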


Institute of Philosophy, Sofia


REFERENCES 


PUTNAM, H. [1967]: ‘Time and Physical Geometry’, Journal of Philosophy, 64, pp. 240-7.

REICHENBACH, H. [1957]: The Philosophy of Space and Time. Dover.

STOLAKIS, G. [1986]: ‘Against Conventionalism in Physics’, British Journal for the
Philosophy of Science, 37, pp. 229-32.

WEINGARD, R. [1972]: ‘Relativity and the Reality of Past and Future Events’, British
Journal for the Philosophy of Science, 23, pp. 119-21.

WINNIE, J. [1970]: ‘Special Relativity Without One-Way Velocity Assumptions’,
Philosophy of Science, 37, pp. 81-99; 223-38.


Brit. J. Phil. Sci. 40 (1989), 77-82 Printed in Great Britain


Taking Mathematics Seriously?


JOSEPH ZYCINSKI 


Michael Ruse, in concluding his defence of the Darwinian approach to 
epistemology, categorically declares: ‘A century and a quarter after the 
appearance of On the Origin of Species, the time has surely come to take Darwin 
seriously’ (Ruse [1986], p. 279). This declaration seems to suggest that: (1) for 
125 years Darwin was not treated seriously, and (2) to be a serious Darwinist 
one must necessarily apply to epistemology sociobiological principles defined 
by E. O. Wilson and his supporters. The problem arises, however, whether 
in the sociobiology-laden philosophy of science one could treat seriously
mathematics and explain its specific role in science and in human culture. This 
problem is important because in earlier sociobiological interpretations mathe- 
matics was presented as a result of genetic conditioning useful in the 
evolutionary struggle for existence. The significance of similar interpretations
was softened by the fact that their authors self-critically counted themselves among
the mathematical semi-literates (Wilson [1984], p. 64). The most intriguing issue,
nonetheless, concerned the question: May the Wilsonian metaphor of human
culture being held on a leash by genes be applied to ‘explain’ genetically the 
content of mathematical theorems? Introducing new vague metaphors and 
using terminology that is far from precise, Ruse seems to answer this question 
positively. His exposition of the relationship between mathematics and
sociobiological versions of evolutionary theory appears different from any 
former standpoint in the debates on the foundations of mathematics. I would 
like to dedicate a few remarks to the nature of these differences. 

Admitting that he cannot offer a fully developed philosophy of mathematics, 
Ruse definitely denies the traditional conception of objective mathematical
truths and maintains that the apparent objectivity of mathematics is illusory 
(p. 173). The essence of the so-called Darwinian approach to mathematics is 
expressed in the thesis that human individuals themselves objectify the truths 
of logic and mathematics, because personal certainty gives us a selective 
advantage in our struggle for existence. ‘The human who believes that “2+2”
really equals “4” is going to act upon it without question . . . And this will give
them a selective advantage over those who question the basic premises of logic
and mathematics . . .’ (p. 172).

Nobody would question that knowledge of mathematics and logic is indeed 
useful for our species. Selective advantages, however, can result from knowing 
not only elementary arithmetic but also such trivial facts as ‘Paris is the capital of
France’ or ‘Nobody can travel faster than light’. Awareness of these facts can 
give the selective advantage to a travelling representative of the species homo 
sapiens; this awareness cannot be, nonetheless, treated as the ultimate 
criterion of truth of the facts in question. Subjective conviction of the certainty 
of mathematics may be evolutionarily useful, but an analogous psychological 
feeling of certainty can be generated by ideological dogmas or pseudo-scientific
speculations. The sociobiological philosophy of mathematics does not explain
what the difference is between the objective certainty of mathematical 
theorems and the subjective certainty shared by supporters of Nazi anthro- 
pology, Velikovsky's astronomy or Lysenko's biology. It does not explain why 
in elementary arithmetic we have no mathematical Lysenkos who, preserving 
the standard meaning of the employed terms, would argue that ‘2+2=5’. 
Such 'nonstandard arithmetic' would certainly provide evolutionary advan- 
tages for groups demonstrating their territorial or financial supremacy in the 
struggle for existence. What kind of genetic conditioning would be required to 
determine our propensity to approve such unusual arithmetic? 

The orthodox sociobiologist will probably answer that ‘2+2=4’ always
because the so-called epigenetic rules necessarily and universally determine 
this 'fact' as an inevitable precondition of evolutionary development. The main 
difficulty emerges, however, when we attempt to define unambiguously the 
semantic content of the expression 'epigenetic rules determine'. Precision
can scarcely be found in claims that mathematics 'is built around principles
informed and constrained by the epigenetic rules' (Ruse [1986], p. 170), that 
its systems are ‘grounded in’ or ‘biologically based on’ these rules (p. 170), that 
the rules in question ‘influence’ our thought and ‘shape’ our culture 
(Lumsden, Wilson, [1983a], pp. 53, 1), or even that ‘the mind and culture... 
sprung from genetics’ (Lumsden, Wilson, [1983b], p. 2). Depending on the
meaning ascribed to the terms ‘grounded’, ‘based’, ‘constrains’, ‘influences’,
we may obtain either a trivial or an evidently irrational epistemology when we 
try to develop the sketched proposals. 

In the sociobiological description of the role of epigenetic rules, one finds
examples of babies preferring sugar to water and of our perception of the spectral
continuum as four colours. Nobody explains, however, how to refer to these
rules in order to assess Cantor’s continuum hypothesis or to define how the 
mathematical four-colour theorem gives us an evolutionary advantage. 
Except for eccentric defenders of the theory of tabula rasa, nobody would 
question today that our biology in a sense ‘influences’ and ‘constrains’ our 
mathematics and that the mathematical culture also is held on a leash by 
genes. If we had the brain of a chimpanzee, we certainly could not discuss 
recursive functions and the axiom of choice. The most important question 
remaining, however, is the question of how long the genetic leash is and how 
strong is its impact on the content of our behaviour. A vision of mathematics
whose objective content is thoroughly determined by the genetic leash may be
interesting for science fiction but cannot explain elementary mathematical 
facts. 

Ruse himself distinguishes different roles of various mathematical theorems. 
He seems to combine sociobiology and intuitionism when he argues that there 
is a set of epigenetically determined simple mathematical principles and rules 
which appear to our consciousness as self-evident. Expressing his implicit 
support for modified metamathematical formalism, he argues that after 
adopting the elementary set of genetically determined truths, logicians and 
mathematicians demonstrate their creative fantasy by developing new 
systems of fantastic games. More advanced mathematics is just ‘an epipheno- 
menon on a biologically based set of simple statements and rules’ (p. 170). This 
approach reminds one of Kronecker’s thesis that the integers were made by 
God, and everything else is the work of man. One difference is that Ruse and 
Wilson replaced God by epigenetic rules, and another is that nobody knows 
precisely which mathematical axioms are determined by these rules. The
optimistic belief that it is enough to have a set of ‘simple statements and rules’ 
to construct all of mathematics turned out to be definitely too optimistic when 
at the beginning of our century basic divergences of opinions emerged in the 
quest for fundamental axioms to which mathematics was supposed to be 
reduced. We could expect that, when repeating the metamathematical thesis 
which inspired many unsuccessful programs in the past, Ruse will define the 
means of overcoming, in his naturalist epistemology, the well-known 
difficulties faced by Russell and Hilbert. He does not, and he never mentions the
difficulties.

Paying no special attention to metamathematical discoveries of our century, 
Ruse tries to apply his scheme to explain the genesis of non-Euclidean 
geometries. He suggests that the Euclidean fifth postulate, though suspect,
remained unquestioned till the 19th century because it was relatively 
unimportant in the evolutionary struggle for existence. When taking part in 
this struggle, our Australopithecine ancestors had to 'know that a straight line is
the quickest way from A to B. But who cares about whether or not parallel lines 
never meet?' (p. 171). One may doubt whether or not the concept of parallel 
lines was indeed pragmatically useless for primitive societies. One may also ask 
why these societies were interested in theoretical problems that deal, for
instance, with the equality of all right angles. The fourth Euclidean postulate 
asserts that all right angles are equal to one another; it is interesting that this 
postulate prompted no objections from mathematicians, even though its 
pragmatic utility seems rather dubious and Australopithecines should hardly
have been disturbed by its content.

Perhaps sociobiology may be used to explain the genesis of some concepts 
in pre-Euclidean mathematics. Euclid's Elements, with its basic notions of
indivisible points, breadthless (aplates) lines and infinite surfaces, is, however,
as distant from evolutionary adaptive advantages as Wilson's evolutionary
epistemology is distant from contemporary mathematics. The most important 
questions that attract the attention of successors of Hilbert and Gödel are, in
this epistemology, either unnoticed or commented on with metaphorical maxims.
Ruse’s epistemological comments on the significance of Gödel's incompleteness
theorem may well illustrate this procedure. Gödel's discovery cannot be
regarded as intuitively self-evident because it shocked the mathematical world. 
It is also hard to treat it as a result of creative playing with symbols and 
conventions, because there exists no mathematical game in which our 
creativity could be expressed in defining a system which is simultaneously: 
(1) isomorphic with the Principia Mathematica, (2) consistent, (3) complete. 

The fact of the incompleteness of arithmetic, regardless of its possible 
significance for our evolutionary development, appears independent of both 
our conventions and our intuitions; the notion of mathematical objectivity is 
especially conspicuous when we consider the amazing results of Gödel's
discovery. When many argue that this discovery belongs to the greatest
discoveries of humankind (Cf. Campbell [1984], p. 262), and that Gödel's
‘work influenced ...all further thought about the foundations of mathe- 
matics’ (Gödel [1986], p. 1), Ruse in his sociobiological epistemology merely 
notices that ‘worries about failures in completeness are vestiges of Platonic/ 
theistic thinking' (p. 170, n. 4). The content of the disturbing theorem 
provides, nevertheless, an especially suggestive counter-example to the 
sociobiological philosophy of mathematics. The incompleteness of logical
systems appears as an objective but counter-intuitive fact irrelevant for
biological selection. The easiest way to eliminate questions about the objective
nature of this incompleteness is to impute Platonic or religious influences to 
one's philosophical opponents. 

If sociobiological epistemology is to be treated seriously and not reduced to
ambiguous metaphorical poetry, one has to determine precisely how to 
understand the link between human genetic conditioning and the content of 
mathematical theorems. The formulae about the genetic leash or biological 
imperatives can be reconciled with mutually inconsistent standpoints in the 
philosophy of mathematics. 

Various sociobiological comments about the essence of mathematical 
knowledge may be consistent at least with three essentially different 
philosophical standpoints: 


1. Genetic determinants are important both for the psychology of mathematical
discovery and for philosophical preferences that are expressed in assessing the 
role of the law of the excluded middle, the axiom of choice, etc. 

2. Our biology determines not only the forms of the growth of mathematics 
but also the content of basic mathematical theorems. These theorems possess, 
however, an objective validity independent of their significance for human 
evolution; the convergence of objective truths and genetic preconditions is
manifested in the fact that both logical inferences and biologically determined 
human intuitions lead to the same set of statements which turn out to be 
important for the survival of our species.

3. The traditional notion of mathematical truth is a result of metascientific 
illusions discredited by sociobiological conceptions of the mythopoeic functions
of science. The ultimate criterion of truth in mathematics is constituted
by epigenetic rules that aim at adaptive advantages of the human species. 
Mathematics so understood provides an effective means of action in everyday 
experience that creates an illusion of its objective truth. 


The first of the indicated standpoints seems justified but scarcely original. It 
is impossible to question the role of biological factors and personal preferences 
in the context of mathematical discovery. The second viewpoint could be 
consistent with the traditional epistemology of mathematics, but it requires 
empirical evidence to confirm the existence of real convergence between the 
objective and the biological-intuitive component of mathematical cognition.
Without such evidence this standpoint would remain intellectually attractive 
but substantively groundless. The third position introduces genetic determinism
in the place of the rational reflection of mankind. In this approach, the
creativity of genes is not restricted to human intellectual propensities but extends to
the very content of mathematical theorems. ‘Vanity and ignorance alone can
support the claim that the human reason has a privileged status.’ (Ruse 
[1986], p. 206.) 

After denying our intellectual privileges, one must treat mathematical 
theorems in pragmatic-aesthetical categories and regard even basic principles 
of theoretical physics, for instance Heisenberg’s Uncertainty Principle, merely 
as pragmatic devices introduced to bar ‘the asking of awkward . . . questions’ 
(Ruse [1986], p. 157). Even in this approach, one may attempt to defend a 
diluted version of objective mathematical truth since it is easy to verify that 
‘2+2’ objectively gives ‘4’ when applied to adding stones, computers or
intellectuals. The key difference is, however, that the ultimate criterion of 
similar truths is to be constituted by epigenetic rules and not by the objective 
reality of mathematical relations. 

Adopting the radical third version of the sociobiological philosophy of 
mathematics results in a simple and aesthetically appealing philosophy of 
visionary reductionism. Its supporters, when praising the merits of this
philosophy, repeat many of the declarations of Ernst Mach and other 19th 
century reductionists. One has to remember, however, that both the atomistic 
theory and Einstein's theory of relativity were called into question from the 
standpoint of Machian philosophy. Both of them were excluded, as being too 
complicated, from the simple vision proposed by Mach. The Einstein-Planck 
revolution in physics pointed out definitely that references to formal simplicity, 
unlimited extrapolations and dogmatic reductionism may be sometimes closer 
to rhetoric than to heuristically valuable research programs. 


Ruse himself gives examples (Ruse [1979], p. 264) of how unrestrained 
extrapolations of Darwin's principles led to pseudoscientific interpretations in 
the 19th century. This uncritical attitude resulted in naive economic and 
theological theories in which laissez-faire economics was labelled ‘social 
Darwinism’ and natural selection was presented as a form of supernatural 
predestination by God—the Great Selector. Considering similar examples, one 
must be critical about new versions of the unlimited extrapolating of 
Darwinian principles. Perhaps the best method for treating both Darwin and 
mathematics seriously is to limit applying these principles to the domains 
of biology and anthropology unless empirical evidence confirms powerful 
extrapolations. 


Pontifical Academy of Cracow, Poland 


REFERENCES 


CAMPBELL, D. M. and HIGGINS, J. C. [1984]: Mathematics: People, Problems, Results. Vol.
2, Belmont, Wadsworth Int.

GÖDEL, K. [1986]: Collected Works. Vol. 1, S. Feferman (ed.), Oxford University Press.

LUMSDEN, C. and WILSON, E. O. [1983a]: Genes, Mind and Culture. London, Harvard
University Press.

LUMSDEN, C. and WILSON, E. O. [1983b]: Promethean Fire: Reflections on the Origin of
Mind. London, Harvard University Press.

RUSE, M. [1979]: The Darwinian Revolution. Chicago, The University of Chicago Press.

RUSE, M. [1986]: Taking Darwin Seriously: A Naturalistic Approach to Philosophy. Basil
Blackwell.

WILSON, E. O. [1984]: Biophilia. London, Harvard University Press.




Brit. J. Phil. Sci. 40 (1989), 83-103 Printed in Great Britain 


Perception and Neuroscience 


GRANT GILLETT 


ABSTRACT 


Perception is often analysed as a process in which causal events from the 
environment act on a subject to produce states in the mind or brain. The role of the 
subject is an increasing feature of neuroscientific and cognitive literature. This 
feature is linked to the need for an account of the normative aspects of perceptual 
competence. A holographic model is offered in which objects are presented to the 
subject classified according to rules governing concepts and encoded in brain 
function in that form. This implies that the analysis of perception must consider not 
only the fact that there is an interaction between the perceiving subject and the 
perceived object but also that the interaction is shaped by a system of concepts 
which the subject uses in thought and action. 


Fire may burn our Bodies, with no other effect, than it does a billet, unless the 
motion be continued to the Brain, and there the sense of heat or idea of Pain, be 
produced in the mind wherein consists actual Perception. Locke [1689]. 


I


Locke's account of perception was a simple one. Events impinged upon the 
periphery of a human being, they provoked further events in nervous tissue 
which relayed these changes in some form to the brain where the simple events 
are formed into complexes and result in conscious experience. It has obvious 
conceptual relations to the development of mechanistic neurobiology. In an 
evolving form the idea that the mind receives a concatenation of elemental or 
primitive sensory items and these generate its picture of reality has dominated 
thought in the empiricist tradition.¹ It has been allied to the thesis that
consciousness involves access to 'inner' experiences with a certain subjective
quality and led to a ‘folk psychology’ which is under some strain (Churchland
[1986]). Now that our understanding of the brain and neuroscience are both
developing we are in search of new models that will help us make sense of
puzzling facts such as ‘recognition of conspecifics is not just one aspect of the
general cognitive business subserving recognition of teacups, dogs, trees and
so forth’ (Churchland [1986], p. 224). I want to offer an approach to
perception and its relation to the brain that differs from the traditional views in
what are, I think, promising ways.

¹ A. J. Ayer espouses this view in The Central Questions of Philosophy (Penguin: Harmondsworth
1973). Elsewhere I have called this the ‘Empiricist Representational Theory’ (Representation,
Meaning and Thought, Oxford: O.U.P., forthcoming).

Neuroscience uses a framework derived from information processing theory 
to give an account of the causal transactions that mediate between an 
individual’s sensory contact with the world and the execution of bodily 
movements, and investigates the way that such informational processes are 
carried out in the brain. Philosophical accounts have, to date, contributed to 
this burgeoning scientific enterprise little more than the notion that complex 
perceived entities such as dogs, trees, teacups and persons are present to 
consciousness as a result of the combination of more simple elements (which, 
for some theorists, are preformed according to linguistic types). By exposing
certain problems in that theory and offering a different theoretical orientation I 
hope to make certain points about the conceptual relations between sensation, 
receptor function, perception, causation, and thought and show how it is that 
these can get so muddled in the discussion of theories of perception. I will apply 
and illustrate the view I favour by attempting to undermine the conceptual 
foundations of a strong physicalist version of the argument from hallucina- 
tion. 


All experience consists in events in the subject's brain which usually but not 
always convey information about the outside world. The fact that in hallucina- 
tion there is no input from the outside world means that what the subject receives 
in experience is something which stops short of contact with external events. 
Therefore (in as much as the difference is not apparent to him) what he is
presented with generally in experience are not the events and objects of the 
outside world which he takes himself to directly perceive but rather events within 
his brain which give him indirect knowledge of the external world. 


II 


Neuroscientific approaches to perception begin with incoming patterns of 
physical events which impinge upon receptor mechanisms. These send 
patterns of neural impulses to the brain where those patterns undergo certain 
transformations in a series of centres each of which modifies the way in which 
information within the array is organized. There is a definite influence exerted 
on the processing of this information by higher functions. This can be 
illustrated by something as basic as attentional selection and focussing on
novel or highly significant stimuli (such as an aberrant member of a series or
one's own name among a list of less meaningful stimuli). The evoked electrical 
activity to such stimuli is of a different type to that which is produced in 
response to the background stimuli among which they appear. This suggests 


Perception and Neuroscience 85 


that the higher recognition which is required to register the complex 
configuration of such stimuli has an effect on the events at an earlier point in 
the processing or input pathway (including the primary sensory receiving 
areas of the cortex). Weiskrantz, in discussing the higher processing of visual
cues, suggests that inferotemporal areas of the cerebral cortex select such
stimuli for directed and maintained attention (Weiskrantz [1974]). These
centres are, of course, much further on in the informational flow than the 
centres in which incoming stimuli are selectively reinforced so as to claim their 
share of cortical processing. Pribram calls this a ‘gating’ function of the infero- 
temporal area which acts so as to change receptive fields and their 
responsiveness in the primary visual cortex (Pribram [1971]). 

The problem is that this backwards effect from higher to lower levels of
processing does not have an echo in the philosophical view that we have 
enunciated. An interesting task therefore becomes to devise a philosophical 
theory of perception which does justice to these facts. This requires us to give 
some account of how the organism ‘knows’ which stimuli to select. That this 
has widespread ramifications in perception is suggested by the remark 
'selective attention phenomena play a crucial role even in simple and 
immediate perceptual tasks' (Ullman [1986]). Theorists have also invoked 
hypothesis testing and similar functions to explain the way sensory processing 
is done. 

To help with these problems the distinction between a 'thin' or structural-causal
and 'thick' or epistemological reading of 'information' is needed. 'Thin' 
information is considered as information because it involves structured events 
which, by interacting in a certain way, play a defined role in a process itself 
understandable in epistemic or representational terms. 'Thin' informational
processes are transactions in cause and effect or dispositional terms and they 
involve steps which mechanically follow one another or interact with one 
another; no notion of validity or groundedness is involved. The latter notions 
imply that besides the result of certain processes or mapping transitions there is 
a norm, rule or standard which that result may or may not meet. A thin 
informational account in which structured events and patterns of events 
causally interact cannot allow this epistemic or evaluative presumption to 
creep into the analysis and beg certain explanatory questions which the 
analysis may not have answered.²

‘Thick’ information is conceptually meshed with talk of knowledge, 
representation (and also misrepresentation) and judgment. It invokes a 
panoply of rational constraints characterizing thoughts about the world. In 
this conceptual milieu the notion of what ought to follow from what (and not 
merely what the system is disposed to do or what tends to happen) is of central 
importance. Discussion of thought introduces the fact that information is 


2 Which is not to say that such an analysis cannot answer those questions but merely that the 
answering of them must explicitly address the normative problem. 




being evaluated for validity or groundedness in accordance with certain 
norms. We require an analysis of the relationship between this latter 
conceptual apparatus and the 'thin' informational processes so as to be 
completely clear just what we are and what we are not claiming about what 
goes on in the nervous system and to avoid sliding imperceptibly over 
explanatory lacunae by allowing mental attributions to 'take up the slack' in 
the scientific account. I shall, as many cognitive scientists do, use epistemic 
metaphors where they seem helpful but will also attempt to illustrate their 
conceptual dangers. 

Incoming informational events provoke a change in the ongoing pattern of 
neural activity in the brain in general and the cerebral cortex in particular. The 
effects produced then result either in the generation of patterns of motor 
impulses (which will ultimately be enacted on muscles or glandular activity) or 
in an altered readiness to respond in the brain itself. Given this picture one can 
attempt to identify the mental processes concerned by correlations with 
introspective data. The paradigm provided by Locke suggests that the gradual 
concatenation of simple informational elements into complex wholes may well 
allow the detection of staging posts in the mind/brain’s construction of reality 
(notice that no distinctions are made within the notion of information 
operative here). Phenomena such as colour impressions or even a well formed 
memory image of sitting in grandmother's parlour were reported by Penfield in 
a series of historic experiments on surgical patients (Penfield and Rasmussen 
[1950]). Our conception of information processing is now much more 
sophisticated but we still find that puzzles about the point in processing where
information becomes conscious have a lingering fascination.³ Data from brain
lesions make it possible to locate which bits, if any, are essential for certain 
kinds of informational transformation—say the matching of an array to letter 
shapes, or the naming of colours—but we do not try to locate a cognitive or 
introspectible phenomenon in some piece of the brain in this way. We are 
aware that to identify a centre necessary for the carrying out of some function 
is only a provisional step in understanding the complexity of the task 
concerned. When we do begin to unravel some of the cognitive elements of a 
given task we often begin to look at what we have intuitively thought about 
that task in a new way. This means that the naive picture we have of many 
psychological abilities must be acknowledged as being provisional and 
requiring further analysis. 

Despite this caution and the development of powerful non-linear processing 
models, the philosophical implications of the traditional account seem clear; 
the brain/mind builds up a picture of the world on the basis of elements of the 
perceptual image which the receptor system has received. The mind processes, 
in a mental way, the information ('thick' sense) coming in just as the brain


3 Assuming that we know what ‘conscious recognition’ is, which is in some doubt (see K. V.
Wilkes, ‘Is Consciousness Important’ in B. J. Phil. Sci. 1985).






carries out analogous physical processes which are the realization of these 
mental phenomena. As certain features and items are detected and assembled 
so the mental (and underlying physical) processes contain information (sense 
unspecified) of greater complexity and action-relevance (or what we might call 
‘a more adequate representation of the world’). This information is processed 
in the higher cognitive, conative and executive reaches of the system. In view 
of this chain of events, it is tempting to claim, with Blakemore, ‘We deceive 
ourselves if we think that our perceptual world is complete. It is what our 
neurones are able to tell us. ... Neurones present arguments to the brain 
based on the specific features that they detect, arguments on which the brain 
constructs its hypothesis of perception.’ (Blakemore [1977], p. 90.) 

In the concluding phrase Blakemore is indicating that a salient feature of 
visual information processing, as it is understood in perceptual/cognitive 
psychology, is that the perceiver sees things in terms of their relations to what
he normally encounters in the world and not in terms of arrays of ‘sensory 
elements'. Blakemore, following Gregory, claims that in this sense the 
perceiver frames hypotheses about what is producing the patterns of 
stimulation impinging upon him.⁴ Some processing views work toward the
conceptually favoured contents by concatenating simple sensory/receptor 
events in some way but such an ‘assemblist’, or ‘stepwise internal processing’ 
theory of perception is, as the appeal to ‘hypotheses’ makes clear, not adequate 
to fulfil the requirements of cognitive psychology and neuroscience (as Carnap 
may have suggested if asked). For instance, a human perceiver tends to close 
incomplete figures and see notional figures based on the contextual cues in an 
array. This has led to explanations which interpose higher functions,
‘hypothesis markers’, in the processing of perceptual information. The
presumption is that the way visual data are organized to form a representation 
depends on the kind of representation the perceiver guesses or expects to be 
appropriate. 

Another problem is that the perceiver, in this picture, is in touch with the 
world only via the medium of a number of intervening events. Worries arise 
from the fact that epistemic attributions require that the perceiver represent 
the object as it really is and thus that there are right and wrong ways of 
perceiving an object and these stand in a normative relation to the cognitive 
and perceptual processes of the individual. Mechanisms within an individual
can only involve complex dispositions to process input in certain ways. But 
somehow our understanding of perception must reveal how a perceiver detects 
what is there in a way not wholly dependent upon his inclination to react thus 
and so but according to categories which dictate how an array ought to be 
classified. When we have explained that then we have explained how it is that 
thick information is available to him. But here the picture which begins with 


4 The ‘hypothesis of perception’ theory is one borrowed from R. L. Gregory, e.g. in ‘Visual
Perception and Illusions’ in J. Miller, (ed.) States of Mind London: B.B.C. 1983. 




what the individual ‘receives’ (in some paraphysiological sense) and requires 
him to construct a world out of that, provokes a fiendish set of difficulties.


If there were an inner man who looked at the retinal image from inside his head, 
his perceptual processes would require just as much explanation as those of the 
whole man. Indeed, such a homunculus would have a few problems of his own. 
The image that he supposedly looks at is upside down, foreshortened, and the 
wrong size; it exists in two slightly discrepant versions, and it changes constantly
as the eyes move. Many sophisticated theories have been advanced to solve these 
problems, but real perceivers do not have them. We do not see our retinal images; 
we see the real environment of objects and events. (Neisser [1976], p. 16) 


The problems arise because we take the primitive elements of sensory input to 
the organism and make the thinker, armed only with the propensities of his 
own computational mechanisms, sort them all out into conceptual thoughts. 
‘Descartes’ dilemma’ is the label Blakemore coins for this problem of going from 
representations to knowledge about the external world and he agrees with 
Neisser that an internal rational homunculus will not do. ‘For the source of 
knowledge, for the solution to Descartes’ dilemma, we must look within the 
maps, at the nerve cells of which they are made—the neurons of knowledge’ 
(op. cit., p. 85). He seeks to elucidate the epistemological nature of concepts and
knowledge by a detailed examination of processes occurring (physically) inside
the subject. In a similar vein, Dennett describes, in outline, hierarchies of 
homunculi each composed of stupider, or as he says, ‘smaller homunculi but 
more important, less clever homunculi’ until the clearcut epistemic trans- 
actions with which one begins are reduced to simple operations easily 
explicable in terms of neuronal states (Dennett [1978], p. 80). But both 
Dennett and Blakemore ignore the conceptual consequence of assimilating the 
two uses of ‘information’. 

It is true that systematically structured sequences and networks of neural 
events contain thin information from the world. It is not clear that they allow 
us to understand thick information about the world. I would contend that we 
need to clarify just what is involved in thick information in order to 
characterize what it is that neuronal networks are out to achieve and how they 
do it, and so elucidate how the brain goes about its cognitive tasks. I will argue 
that, when we do this analysis, certain features of perception and thought, 
even in the brain damaged, become more understandable. 

There is no question that networks (particularly parallel networks) of 
neurones can detect configurations of events within stimulus arrays but there 
remains a question as to how they settle on just the patterns that fit with 
intersubjective rules of use so as to allow a thinker to perceive things as being 
thus and so according to concepts he shares with others. Such concepts 
essentially have a certain independence of individual cognitive processes. 




III 


Neisser realizes that the subject of experience is not merely a locus of processes 
which result from environmental stimulation but in fact makes a contribution 
to perceptual knowledge and that any account of perception must make it 
possible that he is in direct epistemic contact with his world. 


In my view, the cognitive structures crucial for vision are the anticipatory 
schemata that prepare the perceiver to accept certain kinds of information rather 
than others and thus control the activity of looking. Because we can see only 
what we know how to look for, it is these schemata (together with the 
information that is actually available) that determine what will be perceived. (op. 
cit., p. 20) 


He adds 


The schemata that accept information and direct the search for more of it are not 
visual or auditory or tactual, but perceptual. To attend to an event means to seek 
and accept every sort of information about it regardless of modality and to 
integrate it as it becomes available. We pay attention to objects and events not to 
sensory inputs. (op. cit., p. 29) 


Neisser is emphasizing that it is the objects and features of the world that form
the intentional objects of thought. Thus a full account of perception must 
disclose not only the source of the normative features in perception but also 
how they are tied to objects, qua objects, rather than to the sensory input to the
individual. There is, of course, no question that attention involves the selective 
processing of information but a conceptual account should reveal why and 
how significance is given to certain subsets of information and thus what the 
directing schemata amount to. Neuropsychological support for Neisser's view 
is offered by Luria who also suggests that language is a crucial factor in 
perception. 


The process of perception is thus evidently complex in character. It begins with 
the analysis of the structure perceived, as received by the brain, into a large 
number of components or cues which are subsequently coded or synthesized and
fitted into the corresponding mobile systems. This process of selection and 
synthesis of the corresponding features is active in character and takes place 
under the direct influence of the tasks which confront the subject. It takes place 
with the aid of ready-made codes (and in particular the codes of language), which 
serve to place the perceived feature into its proper system and to give it general or 
categorical character. (Luria [1973], p. 229) 


(His remarks should ring bells for those who have paid any attention to Kant or 
Wittgenstein.) What Neisser and Luria want is some idea how to accommo- 
date the 'schematizing' or 'synthesizing' (Kant) knowledge that seems to play a 
crucial role in perception. This calls for a place to be found for what Kant calls 




the ‘rules of the understanding’ whereby the world is presented to the subject 
in a structured form apt to ground certain judgments.⁵ Luria’s appeal to
language is attractive because if we could, in some way, link perceived 
contents to semantics then the normative constraints in linguistic meaning 
would be at our service in attempting to understand perception and the 'thick' 
information involved in it. 

Kant's rules of the understanding which impose conceptual form on 
experience, Neisser's ‘schemata’ tied to objects in the world, and Luria's codes 
linked to natural language all jostle for a place in an adequate account of 
perception. We must also bear in mind the suggestive empirical findings about 
higher to lower 'feed-down' mechanisms and the role of the infero-temporal 
cortex in the direction and maintenance of attention (a faculty which I have 
elsewhere located at the centre of all conceptual thought (Gillett [1988])). 
Lastly, we need to bear in mind our need to fix the source of categorisations 
beyond the dispositions of the individual to react thus and so to presented 
stimulus arrays. Ideally, we want to include these elements in our understand- 
ing of how thought can relate to the obviously intricate structure of the 
neurophysiological processes that underlie it. Even if mental description were 
irreducible to physical description, we would still regard brain states as the 
causal substrate of mental function. 

Neisser must be right to insist that in perception a thinking subject does not
examine retinal images or other neural phenomena and then build up a
picture of the world outside as causing those phenomena.⁷ But, apart from
Neisser's target, who may well be a straw man, the normative consider- 
ations—rightness and wrongness independent of the dispositions realized in a 
given thinker—imply that we need a basis on which to understand 'schemata'
which is not limited to, though perhaps allows for realization by, neuronal 
networks. I will attempt to provide such a basis. 


IV 


We are, in fact, not bereft of a way to understand the relation between 
perception and neuroscience merely because the naive 'processing sequence' 
picture is inadequate. Pribram, a psychologist working at the interface of brain 


5 Kant says that conceptual thought involves concepts (or functions of unity of judgment) and 
rules of the understanding [Critique of Pure Reason (trans. N. Kemp Smith) London: Macmillan
1929, e.g. B171]. When one considers the relations between language, selective attention and 
perceptual contact indicated by Luria, demonstrative judgments will be seen to be particularly 
important. 

6 Such a position is taken both by Davidson in Essays on Actions and Events and C. Peacocke in
Holistic Explanation. 

7 Even though the brain and its informational processing mechanisms enable the subject to
represent the world this view is to be deplored, because a private language cannot serve as an 
adequate basis for meaning and there is no way to learn what terms to apply to one's essentially 
private retinal images. 




function and cognition formulated, at one stage, an analogy between 
holograms and the nature of the mind. He failed to develop the fruitful analogy 
he produced perhaps because of the vague 'wave-front' picture of neural 
activity on which his exposition of it rested and his lack of a good model of brain
processing to supplement his theory. It is now possible to remedy both of these 
defects. 

A hologram is an image of the world with three dimensional properties. It is 
made by producing interference effects (on a photographic plate) from two 
interacting laser beams reflected from the array to be pictured. The plate 
reveals no clue as to the image it holds unless the specific type of laser light used 
to generate the image is used to recover the information encoded therein. 
Pribram remarks:


The properties of the hologram are just those demanded by us to account for 
ordinary perception. I have already made the suggestion that arrival patterns in 
the brain constitute wave fronts which by virtue of interference effects can serve 
as instantaneous analogue cross calculators to produce a variety of moire type 
figures. Now by means of some recording process analogous to that by which 
holograms are produced, a storage mechanism derived from such arrival 
patterns and interference effects can be envisioned. This is possible, since 
reconstructions of images from holograms have many of the attributes of 
perceptions. (Pribram [1969])


Leaving aside the talk about wave fronts etc. which no one has been able to 
develop in a convincing way, the analogy, as I want to develop it, looks like 
this. 


          the laser                     the laser beam
world ---------------> holographic  ----------------> a picture/
                          plate                       representation

        interpersonal                   interpersonal
world ---------------> the brain    ----------------> thought/
          behaviour                     behaviour     representation


In this picture or analogy it is interpersonal activity, or interactions between 
communicating thinkers (which for us means human relationships and the 
human practices of communication constituting a language) that present the 
world to the brain in such a way as the brain can, under their influence, serve 
as the basis of mental life. My contention is simple; ‘rules’, ‘schemata’ or ‘codes’ 
that give form to thought are realized in the brain and have the structure that 




they do in virtue of the way in which brain function has been shaped in an 
interpersonal milieu. Thus the brain is a faithful information processor which 
records and, under the right conditions, produces meaningful expression and 
behaviour, just as the holographic plate 'records' and can be made to 
reproduce three dimensional information. 


We can begin by looking at the origin of the concepts which figure in 
perception. 

Grasping a concept is an ability to link a whole series of experiences in virtue
of some shared feature a subject can detect in or impute to them (Evans [1981], 
p. 104 ff.). How does a thinker come to acquire such abilities? A human being 
starts life equipped with a number of very natural even instinctive reactions. In 
neurophysiological terms these may be mediated by circuitry which has a 
genetic basis in that we are congenitally disposed to react to certain stimuli in 
certain ways. Such natural propensities probably include reactions to 
movement, the human face, bright colours, the tendency to grasp, suck and 
manipulate, emotional responses to pain and pleasure and so on. It is upon 
these natural propensities that concepts can be built. This basis, in congenital 
aspects of brain function, amounts to a natural ‘congruence’ of activity shared 
by conspecific thinkers and forms a foundation on which shared use or shared 
techniques for dealing with the world can be built. Armed with this substantial 
biological congruence with the adults with whom she has to do and the ability 
to modify that behaviour to meet certain standards, a child can build up 
patterns of contact with the world by imitation, shared action, shared 
attention and communication. All these aspects of behaviour are interwoven 
and as they develop the child's brain processing becomes increasingly 
organized by the selective strengthening of some synaptic connections, and the 
elimination of others (Easter et al. [1986]). Note that the ability to suspend or
exhibit a natural response 'at will', as the judgments of others and/or the 
present occasion warrant, is central and sharply relevant to the issues of 
normativity, causality and teleology raised above. In fact, a child builds up a 
set of normatively constrained or rule-governed reactions which conform to 
certain standards of use and serve certain purposes. For instance, the child 
forms the sounds it utters into words and, in forming these words, increases
her ability to respond to the world in the range of ways that human society
makes available. Thus, on the basis of shared natural reactions, the child's way
of seeing the world is being informed and structured by those interactional and 
verbal abilities which she is mastering in expanding and developing her 
thought.⁸ She learns to direct her attention to things that those interacting
with her have picked out and appreciate the experiences she is having as being 

8 L. Vygotsky (Thought and Language, Cambridge [Mass.]: M.I.T. Press 1962) reports the central
role of words and the perceived understanding of those words by others in children's problem
solving.




composed of elements (such as the colour red or a teddy bear) able to be found 
in other experiences. Her experience is thus conceptual experience and 
gradually she acquires a repertoire whereby the way she thinks of things 
comes to involve a structured network of concepts or practices of responding, 
associating, abstracting and grouping according to rules. The child learns to 
expand her informational competence not only by describing her world to 
herself and others and inviting correction of those descriptions, but also by 
discerning the intentions of others, requesting information and so on.⁹ The
child's information is presented and refined in terms of human reactions and 
interactions and shared patterns of wilfully directed response which reflect the 
(informal) rules for communication and thought that she 'latches on to'. 


V 


Throughout the experience of growing up the brain keeps a cumulative record. 
In the human case, as we have noted, the conceptual system with which 
'input' is going to engage in order to enter into thought is one that draws 
heavily on interpersonal activity. Thus the record is structured in terms of 
categories which emerge within the intentional activity of human beings, to 
which the child is exquisitely sensitive (Donaldson [1978]). Both brain activity 
and information processing patterns are therefore influenced by the indicators 
and markers formed by the words and actions of persons. The world is 
presented as comprising items, features and events which are picked out or 
‘codified’ (Luria)/'schematised' (Kant/Neisser) according to the way people 
react to them. Because of the natural propensities already discussed, a human 
thinker finds it easy to latch on to these ways of systematizing his reactions to 
the world. The child's own actions, the objects and properties he encounters, 
and the regularities he observes in their relations are all recorded in ways 
determined by the human interests and concepts in terms of which they are 
experienced and understood (Curran [1980]). We could say that a child 
acquires techniques, focussed on the use of words as tools or devices, which 
guide and inform his activity within an appreciation of the world: ‘Language is 
an instrument. Its concepts are instruments. ... Concepts lead us to make
investigations; are the expression of our interest, and direct our interest’
(Wittgenstein [1953], I.569).

This brings us very close to the active, hypothesis testing, purposive and 
exploratory nature of perception that cognitive psychology has come to accept. 
By acquiring the semantic tools provided by concepts, a knowledge of how 
terms fit together and the way in which they can and cannot be combined in 
human operations, a child begins to learn 'rationality' so that his thought 
makes sense. In this way the elements of perception—each presented 


9 This enables an understanding of the role of ‘egocentric speech’ as is discussed by Vygotsky (op. 
cit.). 




according to the human significance they have and in that way ‘encoded’ in 
patterns of brain processing and informational selectivity—are knit together 
so that the parts are, to some extent, constituted by the place they occupy in 
the whole. For instance, one might see a partially obscured letter in a stimulus 
array first as an E and then as an F depending on what letters are around it. In 
one presentation it may be part of END and in the other of FIT. But this is only a 
simple example of a much wider human tendency to use contextual cues to
disambiguate stimuli. A child gains mastery over this powerful technique as he 
learns to draw on widespread processing capacities in responding to what he 
encounters. Sometimes this cognitive ‘work’ will be quite conscious as one 
puzzles away at a difficult case but most of the time partial or incomplete 
stimuli are just smoothly fitted into the coherent picture of the world that is an 
intrinsic part of conscious experience. As a child reflects upon the terms he uses 
in various situations, the terms he might have been inclined to use, and the 
reactions (actual and hypothetical) of others to those uses, he begins to see 
how both the world and his thoughts fit together. For an item to be perceived it
must be fitted into this project of understanding and thus articulated with 
other experiences according to the concepts and classifications that the child is 
using. Perception is part of one's thought life in virtue of the fact that it is 
constitutively articulated with this structure. ‘I only call an articulated process
a thought: you could therefore say “only what has articulated expression” ’
(Wittgenstein [1975], §32).
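
The E/F case described above can be given a toy illustration. The sketch below is only a hedged analogy in the 'thin' sense of information and makes no claim about how the brain actually resolves such stimuli. The names LEXICON, AMBIGUOUS and disambiguate are hypothetical, and the two-word vocabulary is an assumption chosen to match the END/FIT example.

```python
# Hypothetical illustration: a degraded glyph is compatible with both E and F;
# the surrounding letters select the reading that completes a known word.
LEXICON = {"END", "FIT"}        # assumed toy vocabulary
AMBIGUOUS = {"?": ("E", "F")}   # '?' stands for the partially obscured letter


def disambiguate(array):
    """Return every reading of `array` that forms a word in LEXICON."""
    readings = [""]
    for ch in array:
        # An unambiguous character has exactly one option: itself.
        options = AMBIGUOUS.get(ch, (ch,))
        readings = [r + o for r in readings for o in options]
    return [r for r in readings if r in LEXICON]


print(disambiguate("?ND"))
print(disambiguate("?IT"))
```

The same glyph is read differently in the two arrays only because of its neighbours. The table of admissible words is supplied from outside the procedure, which is where, on the view defended here, the normative work is done.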

The development of perceptual abilities and thus the application of concepts 
to what one encounters is faithfully reflected in the ways in which the 
neurones of the brain come to causally affect one another. Any neuronal firing 
can potentially (at least in early infant life) give rise to an indefinite number of 
different patterns of transmitted neural activity. The particular patterns that 
end up being favoured are a function of experience. Given a basic hard-wiring, 
which may be genetically determined, certain synaptic connections and 
certain excitation patterns will emerge to define the effective information 
processing role of any given neurone or neuronal assembly. This means that, 
even though the natural propensities on which concepts are built may be a 
function of genetic endowment, the actual information processing structure 
and function of the brain and the patterns of activity laid down within it are
dependent on what it has been exposed to. Thus the microprocessing structure 
that comes to be realized in the informational networks of the human brain 
bears a human stamp; it is a ‘presented’ world rather than a merely 
encountered world. Each assembly of processing units will be attuned to 
significant stimulus configurations which the subject has been directed to 
attend to and notice by the way that configurations within that class have 
been treated as noteworthy by those who 'teach' her. Therefore the human 
brain is affected by the world as humanly structured in an analogous way to
that in which the holographic plate is affected by the world via the reflected
laser light. It is the (conceptual) features of the environment as picked out in
the interactions between conspecific thinkers and the weightings given to 
those features that explain how and why the brain selects certain patterns of
input so as to realize responses which ‘fit’ rules guiding activity and thought. 

The plate from which we recover a holographic image requires a specific 
method of projection in its genesis and in its subsequent use. The specific 
medium that produces a hologram is laser light and only the same light can 
bring out the features of the image from the apparently cryptic impression on 
the plate. In an analogous way only interpersonal experience will imprint a 
human brain with that information that allows it to be the causal basis of the 
mind as we know it (a 'locus' of articulated and discursive activity). One would 
expect that the human brain would therefore make maximum use of its 
information when what is there is manifest in the context of human 
interaction.¹⁰ Indeed, given the plasticity of brain function one would suspect
that there might be a certain sustaining role played by continued human
interaction in the maintenance of processing efficacy. This is not to say that the
brain is a completely 'plastic' template or Lockean tabula rasa but only that,
within the natural propensities that brain structure makes available, detailed
connections reflect human experience and are sustained by it.¹¹

Support for this thesis emerges from current developments in parallel 
distributed processing (PDP). A PDP system with feature detectors sensitive to 
elements in an array may be ‘trained’ by a range of simple stimulus 
configurations to detect the recurrent patterns that occur within the range. 
That is, the system detects certain patterns as significant merely because they
co-occur in its experiential history. However, not all arrays are simple and, in
some, elements will occur with about equal frequency in conflicting types of 
patterns (for instance a matrix may be used to show horizontal and vertical 
lines and there is nothing to distinguish between these groups in terms of 
frequency of occurrence of any element (Rumelhart and McClelland [1986], p.
182 ff.)). Where the elements are combined in varying and even conflicting 
ways the system learns much better if it is assisted by cues (or 'correlated 
teaching inputs’) which ‘signal’ or ‘tag’ a given stimulus as belonging to one of 
the specified groups. Such cues can be faded out once the system has developed 
a preference for grouping according to the ‘rule’. 
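
The horizontal and vertical line case can be made concrete with a small sketch. It is not a PDP learning simulation and is not taken from Rumelhart and McClelland; it only tabulates the pattern statistics that such a system would face. The matrix size `N`, the number of cue units `CUE_UNITS` and all function names are illustrative assumptions.

```python
from itertools import combinations, product

N = 4          # side length of the square matrix (illustrative assumption)
CUE_UNITS = 2  # extra 'teaching' units per class (illustrative assumption)


def horizontal(row):
    """Active cells of a horizontal line in the given row."""
    return frozenset((row, col) for col in range(N))


def vertical(col):
    """Active cells of a vertical line in the given column."""
    return frozenset((row, col) for row in range(N))


H = [horizontal(r) for r in range(N)]
V = [vertical(c) for c in range(N)]


def element_counts(patterns):
    """How often each matrix cell is active across a set of line patterns."""
    counts = {cell: 0 for cell in product(range(N), repeat=2)}
    for pattern in patterns:
        for cell in pattern:
            counts[cell] += 1
    return counts


def mean_overlap(pairs):
    """Average number of active units shared by each pair of patterns."""
    pairs = list(pairs)
    return sum(len(p & q) for p, q in pairs) / len(pairs)


def add_cue(patterns, label):
    """Attach the same block of cue units to every pattern of one class."""
    cue = frozenset(("cue", label, k) for k in range(CUE_UNITS))
    return [p | cue for p in patterns]


# Every cell is active equally often in both classes, so element
# frequency alone cannot separate horizontal from vertical lines.
same_frequency = element_counts(H) == element_counts(V)

# Without cues, two lines of the same class share no cell, while a
# horizontal and a vertical line always share exactly one.
plain_within = mean_overlap(list(combinations(H, 2)) + list(combinations(V, 2)))
plain_between = mean_overlap(product(H, V))

# With a correlated cue, same-class patterns share the cue units.
Hc, Vc = add_cue(H, "H"), add_cue(V, "V")
cued_within = mean_overlap(list(combinations(Hc, 2)) + list(combinations(Vc, 2)))
cued_between = mean_overlap(product(Hc, Vc))
```

On these assumptions the uncued patterns give similarity-based grouping nothing to work with, since members of one class overlap less with each other than with members of the other class. The cue units reverse that ordering, which is one way of reading the claim that a 'tag' helps the system find the intended grouping.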

This latter situation approximates much human learning. Human learners 
are exposed to a huge number of different stimulus arrays and must
learn to distinguish quite subtle patterns in which simple cues appear in a
bewildering range of permutations and combinations which differ in
significance. Perhaps the powerful selective cues are provided by the responses of


10 The importance of interpersonal exchanges also suggests why humans take attitudes to
themselves and enter into intra-personal dialogue once they have mastered a system of
discourse.

11 A feature which we use in the rehabilitation of victims of neurological damage.

conspecifics who themselves know ‘what counts as’ an instance of the rule 
being learned. Their contribution is therefore to impart the bases on which we 
‘go on in the same way’ and so come to grasp those ‘rules of the understanding’ 
which the developing thinker uses to make sense of experience (exactly similar 
points will be able to be made about non-human thinkers who use articulated 
rules to guide and inform their activity in the world; my continued discussion 
of human thinkers should not be seen to be making exclusive claims except 
where the nature of the abilities involved entails such). The central place of 
conspecifics in this account makes it unsurprising that conspecific recognition
should be neuropsychologically different from the ‘cognitive business’ of
recognition in general (Churchland, loc. cit.). It is suggestive that the areas
from which the higher influences on stimulus selection and directed attention
discussed above arise (infero-temporal areas) are closely contiguous with those
areas in which recognition of conspecifics and their facial expressions is
mediated (mesial occipito-temporal cortex (Tranel and Damasio [1985])). One
could plausibly conjecture that we are neurologically organized so as to extract
the maximum help for our cognitive tasks from the responses we observe in
conspecifics. 

The human brain begins with certain natural structural features that
ensure a basic congruence in reactions between human beings and these basic 
capacities enable persons to develop shared rules for perceptual judgments. 
These rule-governed abilities inter alia are the basis of the mental life of
thinking beings. 

In an analogous way to the way in which the world as presented by the 
medium of reflected laser beams changes the structure and responsiveness of a
photographic plate so as to create a holographic representation, so the world as
presented to the human subject in the interactions and communications of 
others changes the responsiveness and capacities of the human brain to create 
a unique representation. This ‘representation’ involves processing abilities 
shaped according to a system of human concepts and apt for the cognitive 
tasks performed in perception and reasoning. 


VI 


This approach gives us a way of dealing with the problems with which we 
began. Sensations arising from states in the body or from external objects 
cause events in the nervous system which form patterns of neural activity 
underlying thought and behaviour. The way in which neural events assume a 
causal role in the activity of human beings is, however, determined by the way 
that recognition and reidentification of objects have been shaped by rules for 
the use of concepts and been (contingently) linked to those events. These 
concepts, constituted within the domain of public meaning and knowledge, 
involve items and happenings as they are presented to beings who fall within
the scope of a common natural history.¹² The ‘schemata’ which sort out our
sensory manifold so as to select those patterns that fit some concept or
conceptual specification are therefore determined by human patterns of 
activity. Thus the items we perceive are not a function of information content 
which is invariant as presented to different ‘information receivers’. If the 
similarities between two receivers consist solely in physical events and their 
patterns of impingement upon receptor surfaces then the crucial structuring of 
information which occurs on the basis of natural congruence but involves 
elaborate higher order processing activity will be lacking. Therefore the items 
we perceive are misleadingly analysed in terms of physical impingements on 
sensory organs or ‘stimulations of sensory receptors’ (Quine [1969], p. 84); 
they are more accurately seen as figures, objects, events, qualities and 
operations as presented to us within the milieu of human activity. 

Meaning pervades perception, because perception is a human activity in 
which sensory contact with the environment gives us information about the 
things that concern (or potentially concern) us. The brain is adapted to deal 
with information in the highly specific ways determined within practices 
where the subject has learnt rule-governed techniques of search and detection 
and not by its status in a causal theory. It is not surprising that we talk not only
about perceiving ‘red’, ‘square’ and ‘movement’ but also, for example, the 
romantic character in some of Mozart's later music, the logical necessity of 
~(a . ~a) or the tension in the air during a difficult meeting. Neisser remarks,
‘In the normal environment most perceptible objects and events are
meaningful.... These perceptions often seem very direct in that we become aware
of the meanings without seeming to notice the physical details that provide
evidence for them.’ (Neisser, op. cit., p. 70). In other words, it is events and
objects that mean something to me which form the 'bedrock' of my cognition 
because whenever I classify something as a this or a that or see it as being thus 
and so a higher function imposes form on what I perceive. For this reason the 
'subpersonal' level of analysis may miss what it is that a person perceives and
completely fail to detect the crucial influences at work in determining the
elements of experience. To exhibit what is involved in perception we must
reveal the interaction between our sensory contact with the world and the
rule-governed concepts which are ‘encoded’ in those higher functions 
modifying sensory input. 

We do not construct a repertoire of responses to the environment (be they 
epistemic or pragmatic) on the basis of events which are identical for different 
physical detectors but rather we are initiated into a shared structure of 
meanings which are enabled by but also which influence the structure of our 
brain processes. The input into the human system is in richer terms than those 


12 Wittgenstein would say that understanding how these things determine meaning is to
understand the ‘grammar’ of an expression, op. cit., 1953, 1.90, 371, 373.

a physical event theorist usually admits into his account.¹³ Thought and the
contents of perception are structured by the ways in which those contents 
have fallen or could fall within the interpersonal activities where we learn to 
perceive and in which our brain takes on the ‘informational shape’ that will 
serve as the causal basis for mental life. 

It is clear that if there is a disorder either in the causal interface between our 
receptor organs and the world or in the brain which will receive the input from
the world then important perceptual consequences will follow. Firstly, we will 
lose the congruence which was a condition of the agreement in judgments 
needed to provide us with our repertoire of perceptual and mental acts. 
Secondly, if we have the repertoire already, then the informational patterns to 
which it is attuned will be disturbed and so our thought will lose that vital 
enabling contact with the world by which its content is fixed. Of course, 
cognition depends upon the integrity of the structured brain processes which 
enable both perception and judgment to occur in a regular relation, but mental 
acts can only be philosophically elucidated by attention to the criterial or 
constitutive exchanges and experiences (ordinary non-Cartesian sense) in 
which a thinker has mastered the techniques of using rules to deal with the 
world on the basis of his sensory contact with it and so has grasped the 
concepts involved. Language, as Luria noted, is important because it is so 
prominent in human rule-governed activity and details so many of the 
nuances of classification and response that modulate our appreciation of what 
we perceive. 


VII 


We can now return to the physicalist version of the argument from
hallucination. It is argued that in a case of hallucination an apparent 
perception can arise purely from brain events (a number of different causal 
stories are wheeled in to colour the argument). I recognize and report these
events as perceptual experiences of the world as I take it to be, much in the
same way as I recognize other internal events such as itches and pains. Thus, it
is argued, perception consists in knowledge by inference to an external world
on the basis of the evidence provided by internal (perhaps brain) events.
Against this I have argued that perception and its conceptual elements will 
not conceptually reduce to physical events and thus that to understand what it 
is to perceive something is not merely to detail internal dispositions, neuronal 
events or successions of physical states. Mental life may, ‘physically 


13 Much of the same point is made by John McDowell in ‘Criteria, defeasibility and Knowledge’, Proc.
Brit. Acad. 1983.

14 The point is familiar from the arguments of J. Fodor, Psychological explanation, New York:
Random House 1968, D. Davidson ‘Mental events’ in Essays on Actions and events, Oxford:
O.U.P. 1980 and D. W. Hamlyn ‘Behaviour’ Philosophy xxviii (1953).

speaking’, comprise certain neural events but we need a glimpse into that 
‘meaning structure’ constituted by persons and their activities to elucidate 
mental content, and thus the nature of perceptual competence. Whether we 
approach this from the epistemological or neuro-scientific point of view we end 
up looking for an understanding of how the brain sorts and classifies the 
stream of stimuli with which it is bombarded. Thus we need some account of 
how a perceiver learns what counts as this or that (perceptually) or learns the 
rules governing perceptual judgment. I have also outlined the way in which 
perception-based thought and judgment concerns public phenomena. Some 
concepts, particularly those to do with the bodily states, sensations (such as 
pain or nausea) or mental acts of one of the interlocutors, will have criteria for 
their application which involve an acknowledgement of the different epistemic 
positions in which users find themselves, but the rules for their use must still 
focus on public cues and markers of meaning (‘criteria’). Thus, as I have noted, 
the way in which physical sensations (such as itches and urges) are ‘encoded’ 
in brain events depends on the intersubjective practices of meaning which 
confer content upon them. Therefore, to argue that aberrant causal chains
leading to false perceptual experience preserve the essential subject-involving 
features of perception, neglects the fact that mental activity is conceptually 
embedded within a public world. The only reason why hallucinations have 
content is that they borrow it from instances of genuine (direct and indirect)
perception which have acquired it in a context of rule-governed shared 
responses and natural propensities. What such cases do teach us is that terms 
dealing with inner sensations should not be assimilated to the case of reported 
outer objects of perception but must be regarded as having a less obvious basis 
for their meaning (or a less obvious ‘grammar’). Quite apart from this, we 
should be cautious about an over-ready assimilation of the properties of
hallucinations to normal experience for two reasons. Firstly, it is not at all clear
that hallucinations ever completely simulate their perceptual ‘cousins’; indeed
it is probable that they are rather deviant likenesses (either in experiential
figure or ground) of real experience and they can often be distinguished as such
even by an incompetent perceiver. Secondly, by the very nature of hallucinosis
it is quite unclear what thoughts can actually be attributed to the subject. A
person having an hallucination has an inclination, perhaps irresistible, to
think that she is perceiving something that she is not. We cannot allow that 
the only thing that could cause such an inclination is the brain presenting her 
with a picture which arises more proximately in the causal chain than a 
normal perception. Indeed, when we consider the arguments against it, we see 
that that could not possibly be an adequate characterisation of the situation. 
But we need not opt for this view. To think one is seeing something need not be 
to see something which exists only in thought. 



VIII 


Neisser demanded some account of cognitive structures which would enable 
us to understand how experience could be met by anticipatory schemata apt to 
suit the variety of instances which a concept may subsume and also apt to deal 
with 'the real environment of objects and events'. I have addressed this 
demand by pointing to the rules or normative constraints which tie content to 
a range of situations and the judgments made by concept users in those 
situations. To have a given concept—or structure of 'rules of the
understanding’ (Kant)—is just to have the ability to apply an intersubjective rule for
linguistic and other behaviour across and within a variety of situations. Of 
course, this is not to say that we cannot learn anything about the way the 
brain builds complexes of information in a form apt to engage with the 
conceptual system from studying the processing of patterns of input events; it 
would be foolish (and empirically discredited) to do so a priori. The best guess 
would be that philosophy of mind, psychology and neuroscience would all 
benefit from 'co-evolutionary development' but not quite in the way 
Churchland envisages. One can conjecture that the complex centrifugal effects 
on receptor function may carry the weightings and structure imposed by our
conceptual experience down to the most basic levels of neural function
subserving perception. For instance, the output from the retinal cells is
modulated by effects which originate in the cerebral cortex and which thus 
alter the patterns of firing in the earliest stages of reception of visual 
information (thin sense). But details of the actual way that the brain realizes its 
informational task are not my concern, except in so far as to insist that its 
function will reflect the (public) meaning structure in which it has taken 
'shape'. More complex mental abilities will trade in the currency provided by 
simpler (though holistically connected) sensory classifications. At each stage 
in the mental response the derivation of significance will reflect rules which 
operate within the conceptual scheme the subject has acquired. 

Perception, the process by which sensory contact engages with the subject's 
grasp of the meaning structure and the world it structures, is, of course, 
dependent upon the normal causal interaction between receptor surfaces and 
the world. That fact, however, does not support a causal theory of perception. 
In fact, no sequence of processes gives us the right kind of understanding of 
what perception is in that it does not necessarily see primacy where the subject
does (e.g. in mother's face) and it underplays the role of the concept-using 
subject and the rules he is following. 

Neural complexes combine and react to patterns of input in complex ways 
that are sensitive to experience, but, studied apart from the nexus of
interpersonal and meaningful practices where they take shape, they do not 
explain the content of thought. They obey causal, physiological laws of
neuroscience.¹⁵ Given certain causal antecedents, neural complexes produce
certain outputs and thus provide the causal basis for thought and meaning but 
the processes involved can neither disclose nor explain the meaningful 
structure of perception and thought. 

Blakemore remarks, 'We seem driven to say that...neurones have 
knowledge. They have intelligence, for they are able to estimate the probability 
of outside events.’ (op. cit., p. 91). But we must demur: it only seems this way
because the neurones are part of people who think. The neurones faithfully
react to impingements on the nervous system and people, who participate in
practices where they have mastered the techniques of judgment that are
enabled by these patterns of reaction, 'estimate the probability of outside
events’.

No neurones have knowledge; a neurone no more knows that it is firing as a
result of the effect of a red object than a blood vessel knows it is contracting as a
result of a vasomotor reflex. Knowledge talk is just out of place and misleading
as it tends to produce the illusion that the epistemic task of accounting for the
rule-governed nature of thought has been discharged. It is, therefore, just not
true that ‘Descartes' Dilemma’ is to be resolved by more knowledge about what
goes on in a thinker. It is to be resolved by pointing out that what goes on in a
thinker is, at every level, adapted to the way that life is conducted outside him
in a rich, conceptually infused world. The source of knowledge, as Neisser
remarks, is the real environment and the ‘schemata’ or rules which structure
our understanding of it. We cannot, in fact, understand why patterned or
structured causal interactions in the brain should have the configurations 
they do unless we locate them within a project whose general outlines can be 
sketched in mental terms. 

We can make helpful gains in understanding brain processes by invoking 
metaphors based in interpersonal communication (e.g. ‘evidence’ or
‘analysis’) but, given the difficult ‘grammar’ of our talk about mental states, must do
so with caution. We are not quite in the happy position of having a simple
concept like ‘spin’ to import into the field of study (as can be done in quantum
physics). We are constantly walking in a conceptual maze where the
metaphors we are using relate to workable but complex and non-ostensively
definable models and invoke the very terms we are attempting to elucidate.
More often than not the subtleties, presuppositions and conceptual
connections of epistemic discourse are overlooked or even conflated into an
enthusiastic but muddled ‘explanation’ of what has puzzled epistemology for 
centuries. A plausible but potentially misleading literature has grown out of 
the idea that neural components can think, feel and exercise intelligence. They 


15 Note that the understanding of these functions itself usually depends upon meaningful thought 
to fill out content specifications—by interview, introspective reports, reasoning tasks or tests of 
comprehension etc. It will also depend on the theoretical model provided by thick (or 
conceptual) information systems. 



do none of these things. People (and, arguably, animals) do these things and in 
so doing exploit the amazing capacities of the nervous system. 


IX


A major step was taken in science when we realized that inanimate objects 
could move without spirits animating them. Unfortunately in neuroscience we 
often make the retrograde step of ascribing to inappropriate bearers the mental 
and spiritual predicates properly reserved for those who participate in practices 
which enable thought and meaning. It is no wonder that as a result of so doing
we become profoundly confused and end up with problems about the nature
and contents of perception.

I have tried to use the hologram analogy to show that, even if the meaning 
and content of thought is only to be understood in terms of the milieu in which 
it has taken shape, we can still appreciate the vital informational role played by 
the brain in allowing that milieu to be the way that it is. The holographic 
aspect of human brain function is its adaptation to a way of receiving 
information in which objects and events fall into certain classifications with 
widespread behavioural implications. These classifications are captured by 
human concepts and shared by concept-users. The system of classifications 
imposes a shape on the way that the brain works and in order to understand 
and unlock that shape we need to make extensive appeal to human thought 
and interest. The hologram could be said to bring the importance of this 
perspective back to the surface so that it can be acknowledged in framing an 
adequate theory of perception and thought. 

Aristotle said that the form of the human being was to be a rational and 
social animal and I would contend that even the processing shape of the 
human brain bears the indelible mark of that form. 


ACKNOWLEDGEMENTS 


I should like to thank Dr K. V. Wilkes (St Hilda’s College, Oxford) and Mr P. F. 
Snowdon (Exeter College, Oxford) for comments on an earlier draft of this 
manuscript. 


Magdalen College, Oxford 


REFERENCES 


CHURCHLAND, P. [1986]: Neurophilosophy. Cambridge: M.I.T. Press. 

CURRAN, H. V. [1980]: Cross-cultural perspectives on cognition; in G. Claxton, (ed.), 
Cognitive Psychology: new directions, Routledge and Kegan Paul. 

DENNETT, D. [1978]: Brainstorms. Cambridge (Mass.): M.I.T. Press. 

DONALDSON, M. [1978]: Children’s Minds. London: Fontana.



EASTER, S. S. Jr, PURVES, D., RAKIC, P. and SPITZER, N. C. [1986]: ‘The changing view of
neural specificity’, Science, 230, p. 507.

EVANS, G. [1981]: The varieties of reference, J. McDowell (ed.). Oxford: O.U.P.

GILLETT, G. [1987]: ‘The generality constraint and conscious thought’ in Analysis.

LOCKE, J. [1689]: Essay concerning human understanding, P. H. Nidditch (ed.). Oxford:
O.U.P. 1975 Bk II, ch. IX, #3.

LURIA, A. R. [1973]: The Working Brain. Harmondsworth: Penguin.

NEISSER, U. [1976]: Cognition and reality. San Francisco: W. H. Freeman.

PENFIELD, W. and RASMUSSEN, T. [1950]: The cerebral cortex in man: a clinical study of
localisation of function. New York: Macmillan.

PRIBRAM, K. H. [1969]: ‘Some dimensions of remembering: steps toward a
neuropsychological model of memory’ in Pribram (ed.), Perception and Action.
Harmondsworth: Penguin.

PRIBRAM, K. H. [1971]: Languages of the brain. New Jersey: Prentice Hall.

QUINE, W. V. O. [1969]: ‘Epistemology naturalized’ in Ontological relativity and other
essays. New York: Columbia U.P.

RUMELHART, D. E. and MCCLELLAND, J. L. [1986]: Parallel distributed processing:
Explorations in the microstructure of cognition, vol. 1, Bradford: MIT Press.

TRANEL, D. and DAMASIO, A. R. [1985]: ‘Knowledge without awareness: an autonomic
index of facial recognition by prosopagnosics’, Science, 228, p. 1453.

ULLMAN, S. [1986]: ‘A.I. and the brain: computational studies in the visual system’ in
Annual review of neuroscience 9.

WEISKRANTZ, L. [1974]: ‘The interaction between occipital and temporal cortex in
vision: an overview’ in F. O. Schmitt and F. G. Worden (eds), The neurosciences,
third study program. Cambridge (Mass.): M.I.T. Press.

WITTGENSTEIN, L. [1953]: Philosophical Investigations 1.569 (tr. G. E. M. Anscombe).
Oxford: Blackwells.

WITTGENSTEIN, L. [1975]: Philosophical Remarks (tr. R. Hargreaves and R. White).
Oxford: Blackwells.


Brit. J. Phil. Sci. 40 (1989), 105-119 Printed in Great Britain 


Is Confirmation Differential? 


EDWARD ERWIN AND HARVEY SIEGEL 





1 A Puzzle About Confirmation 

2 Three Views About Differentialness 

3 Moderate Differentialness and Theories of Confirmation 
4 The Case For Differentialness 


1 A PUZZLE ABOUT CONFIRMATION


Consider the following puzzle about confirmation. A surprising empirical 
phenomenon, P, is the object of considerable scientific investigation. Five rival 
theories have been proposed to account for P; although we have excellent
reasons for thinking that they exhaust the plausible options, we have no 
reason to prefer any one of the five to any of its competitors. All of the five 
provide equally good explanations of our experimental evidence and are 
equally plausible given our background evidence. We have good reason, then,
for believing the disjunction: 


T₁ v T₂ v T₃ v T₄ v T₅


but no reason for preferring any disjunct. 

Suppose that further experimental investigation allows us to eliminate T₃,
T₄ and T₅. We have, that is, gathered additional evidence, E, that disconfirms
each of these three disjuncts. What is the epistemic status of the remaining
disjuncts? Does E confirm (say) T₁?

We find this case at least mildly puzzling because we have some inclination 
both to affirm and deny that E confirms T₁, and we have found that other
philosophers disagree with each other about how to describe the case. In favor
of confirmation, it could be pointed out that the chances of T₁ being true have
improved as the result of obtaining E; the probability of T₁ has increased from
.2 to .5. We thus have more reason than we had before the discovery of E to
believe that T₁ is true. Against confirmation, it might be argued that E does not
provide any reason to believe that T₁ is true (or even approximately true); for
T₁’s epistemic standing is no better than that of its rival T₂.
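
The figures of .2 and .5 in this paragraph can be recovered by a short calculation. What follows is only a sketch under stated assumptions, none of which is spelled out in the text: the five theories are treated as mutually exclusive and jointly exhaustive, each has the same prior probability, E is taken to rule out the third, fourth and fifth theories outright, and E is equally likely on either of the two survivors (the symbol k for that common likelihood is introduced here for illustration).

```latex
\begin{align*}
P(T_i) &= \tfrac{1}{5} = 0.2 \qquad (i = 1, \dots, 5), \\
P(E \mid T_1) &= P(E \mid T_2) = k > 0, \qquad
P(E \mid T_3) = P(E \mid T_4) = P(E \mid T_5) = 0, \\
P(T_1 \mid E)
  &= \frac{P(E \mid T_1)\, P(T_1)}{\sum_{i=1}^{5} P(E \mid T_i)\, P(T_i)}
   = \frac{k \cdot \tfrac{1}{5}}{k \cdot \tfrac{1}{5} + k \cdot \tfrac{1}{5}}
   = \tfrac{1}{2} = 0.5, \\
P(T_2 \mid E) &= \tfrac{1}{2} = 0.5 .
\end{align*}
```

On these assumptions the first theory's probability rises from .2 to .5, which is the point made in favor of confirmation, while its surviving rival ends with exactly the same value, which is the point made against it.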

These two common intuitions concerning confirmation thus appear to 
conflict. On one view, E confirms T₁; on the other view, it does not. It appears
that one of the intuitions must give way. 



In defense of the second view, we said that E did not confirm T₁ because it did
not provide any reason for thinking T₁ to be true; it provided reason only for
believing the disjunction: T₁ v T₂. What is presupposed here is not only that
confirmation is epistemic (it provides reasons for belief), but also that it is 
differential. To articulate these two presuppositions, we introduce the notion of 
differential confirmation: E confirms H differentially exactly if E affords at least 
some reason for believing that H, i.e. for taking H to be true, and also does not
afford equal (or better) reason for believing some rival hypothesis, whether 
formulated or not, that is at least as plausible. In our example, E fails to confirm 
T₁ differentially, for E fails to provide any reason for preferring T₁ to T₂. Does E,
nevertheless, confirm T₁? More generally, is all confirmation differential?
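
The definition of differential confirmation can be restated as a small check. This is a sketch only: the numeric `plausibility` and `support` scores, the class `Hypothesis` and the function `confirms_differentially` are hypothetical devices introduced here, the definition itself does not say how reasons for belief are to be measured, and rivals that have not been formulated cannot be listed and so fall outside the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    plausibility: float  # hypothetical score for how plausible the hypothesis is
    support: float       # hypothetical score for the reason E affords for believing it


def confirms_differentially(h, rivals):
    """True if E gives some reason to believe h and no listed rival that is
    at least as plausible receives equal or better reason from E."""
    if h.support <= 0:
        return False
    return not any(
        r.plausibility >= h.plausibility and r.support >= h.support
        for r in rivals
    )


# The puzzle case: the two surviving theories are equally plausible and
# E bears on them equally (the particular numbers are arbitrary).
t1 = Hypothesis("T1", plausibility=0.2, support=0.3)
t2 = Hypothesis("T2", plausibility=0.2, support=0.3)
puzzle_case = confirms_differentially(t1, [t2])

# A contrast case in which E favours the first theory over its rival.
weaker_rival = Hypothesis("T2", plausibility=0.2, support=0.1)
contrast_case = confirms_differentially(t1, [weaker_rival])
```

Here `puzzle_case` comes out false and `contrast_case` true. The puzzle case is false whether one scores the support for both survivors as positive, as above, or as zero, so the sketch takes no side on whether E affords any reason for belief at all.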

Because intuitions disagree about this issue, we are skeptical about 
resolving it by a direct appeal to intuitions. Even if intuitions were fairly 
consistent, such agreement might be the result of sharing some false theory 
about confirmation. We are also skeptical about appealing to scientific practice 
to settle this particular issue. Although we will not try to demonstrate this, it 
can be shown that scientific practice concerning the requirement of
differentialness is not uniform. In certain cases, scientists will judge an hypothesis to
be confirmed even in the evident absence of differentialness, while a reviewer of 
the same evidence will disagree. We might also resort to stipulation. If we 
stipulate that as we use the term 'confirmation' all confirmation is differential 
(or non-differential), then, of course, it follows concerning our puzzle case that 
neither T₁ nor T₂ is confirmed (or that both are). However, stipulation does not
answer the question: As the term ‘confirmation’ is generally used in the theory 
of science, is all confirmation differential? Finally, we might look to some 
theory of confirmation for guidance. But which one? As we shall see later, 
different theories give inconsistent answers concerning the issue we are 
discussing. 

Before determining how our question might be answered, it will be helpful to 
distinguish several positions concerning differentialness. 


2 THREE VIEWS ABOUT DIFFERENTIALNESS 


Some philosophers distinguish between what might be called a ‘weak’ and 
‘strong’ sense of confirmation. In the former sense, an hypothesis is confirmed 
if and only if it has some degree of empirical support; in the latter sense, the 
empirical support has to be substantial (let us say, substantial enough to 
warrant provisional acceptance of the belief that the tested hypothesis is true, 
or approximately true). Even if one holds that there is only one sense of 
‘confirms,’ one can draw an analogous distinction so long as it is conceded that 
confirmation comes in degrees. If it does, then we can say, without postulating 
different senses, that confirming evidence is sometimes weak and sometimes 
strong. 




We can now distinguish three different views about differentialness. A strong 
non-differential view holds that evidence that is non-differential can nevertheless
provide substantial support for a hypothesis. A moderate non-differential
view holds that such evidence can be confirmatory, but can only provide weak 
support. Finally, a differential view holds that non-differential evidence cannot 
be confirmatory to any degree. 

A strong non-differential view is probably not widely held, but is apparently 
presupposed by Fine and Forbes. They write [1986]: 


For the availability of alternative explanatory hypotheses for a given body of data 
is a well known feature of every reasonably complex scientific investigation. 
Moreover, one is generally not able to eliminate all the plausible rivals to a 
particular hypothesis. It follows from the fact that science goes on in this 
situation that for data to count as supporting a hypothesis it is not necessary to 
first rule out all competitors.¹


The above remarks clearly suggest a non-differential view, but by them- 
selves are neutral between a strong and moderate view. However, in the 
quoted passage, Fine and Forbes are criticizing Adolf Grünbaum's arguments 
for the thesis that the clinical evidence for psychoanalytic theory is, in his 
words, ‘remarkably weak’ (Grünbaum [1984], p. 278). One of his central 
strategies is to show that the psychoanalytic clinical evidence is (in our 
terminology) non-differential: the data do not rule out rivals to the psychoanalytic
hypotheses that are of equal or greater plausibility. Grünbaum does not
deny, however, that weak support might be provided by such data. So, if Fine's
and Forbes' criticism is interpreted so as to be relevant to Grünbaum's
conclusions, they must be taken as saying that strong support can accrue even 
in the absence of ruling out plausible rivals to the hypothesis being tested. If 
that is what they are saying, their argument for a strong non-differential view 
is not convincing. From the fact that science ‘goes on’ even when some 
plausible competitors cannot be ruled out, it does not follow that strong 
support—or any support at all—can accrue to a certain hypothesis in that 
circumstance. Many Freudians have gone on for quite a long time believing 
that central parts of Freudian theory are warranted by clinical data. They may, 
however, be mistaken. 

Another argument for a strong non-differential view is given by Calvin Hall, 
who has provided what has been interpreted as strong experimental 
confirmation of Freud’s oedipal theory. Hall [1963] considers the question of 
whether an alternative theory could have predicted his results, but says that 
the question is ‘irrelevant’. He argues [1963] that there is no need to consider


1 A similar view is apparently defended by Alexander Rosenberg in his criticism of Philip Kitcher's 
treatment of sociobiology. Cf. Kitcher [1985], pp. 66-7; Rosenberg [1987], p. 80; and Kitcher
[1987], p. 92. The dispute between Rosenberg and Kitcher on these pages seems to be precisely
the dispute between non-differential vs. differential views of confirmation, with reference to 
Darwinian histories specifically and the life sciences more generally. 




empirical findings from the point of view of other theories so long as a theory 
meets three conditions: it maintains (a) its heuristic value; (b) its capacity for 
making sense out of a wide variety of phenomena; and (c) its ability to generate 
correct predictions. 

Hall’s conditions are insufficient, i.e., they provide no guarantee that a 
theory will receive strong support from its confirmed predictions. In this 
experiment, Hall made relatively interesting predictions about the sorts of 
dreams his subjects would report, but he could have met his three conditions 
and made quite trivial predictions, such as 'Many subjects will report dreams 
about people’. Confirming such trivial predictions, even if they followed from 
Freud’s oedipal theory, would not provide strong support for the theory.
Because this point is generally conceded in the philosophic literature, and 
because Hall himself would probably agree, we will not argue for our 
contention. Hall, however, can reply that if his three conditions are met and the 
predictions are non-trivial, then strong confirmation accrues to the tested 
hypothesis. However, even this modified view is insufficient, unless 'non- 
trivial’ is interpreted to mean ‘could not have been predicted or retrodicted by a 
plausible rival hypothesis’ (in which case, strong non-differentialness is 
abandoned). Suppose that Hall's predictions are non-trivial in that they do not 
follow from common sense assumptions or from any rival theory that has 
already been warranted. Still, if his experimental design does not rule out (i.e. 
discount to some extent) an incompatible theory that is at least as plausible, given
both background and new experimental data, then his theory is not strongly 
confirmed. For, if it were, and we were aware of the incompatibility, we would 
be warranted in believing what we knew to be a contradiction. The strong 
non-differential view, whether or not it avails itself of Hall's three conditions, 
has the same consequence: We can be warranted, if the view is correct, in 
believing both H1 and H2 even when we have good reason to believe that the
conjunction of H1 and H2 is a contradiction.

It could be replied that it is illegitimate to conjoin warranted beliefs in this 
manner. That is, from the fact that belief in H1 is warranted and belief in H2 is
warranted, it does not follow that belief in H1 and H2 is warranted. This reply,
however, is quite implausible without some explanation of why belief in the
conjunction would be unwarranted. Let H1 be ‘Opals are mined in Australia’
and H2 be ‘Opals are mined in Brazil’. One could be warranted in believing each
hypothesis and yet not be warranted in believing a third hypothesis entailed by 
the conjunction of them. For one might fail to notice the entailment. If, 
however, one realizes that ‘H1 and H2’ must be true if H1 is true and H2 is true,
then one has warrant for believing the conjunction once each conjunct is 
warranted. If I establish that opals are mined in Australia and establish that 
they are also mined in Brazil, then I am warranted in inferring that they are 
mined in both places. Is the situation different where H1 and H2 are logically
incompatible? Obviously, if I know that they are incompatible I am not




warranted in believing their conjunction. This does not show, however, that I 
would not be so warranted if I were simultaneously warranted in believing
each conjunct. What it shows, instead, is that I am not simultaneously 
warranted in believing each conjunct once I realize their incompatibility. If I 
am warranted in believing that opals are mined in Australia, but new and 
equally good (or better) evidence indicates that this is not so, my initial 
evidence is undermined. 

Since the strong non-differential view permits the warranting of what is 
known to be a contradiction, that view is implausible. Because we intend to 
reject even the moderate non-differential view for reasons that will, if they are
cogent, tell as well against the strong view, we will henceforth discuss only the 
former. 


3 MODERATE DIFFERENTIALNESS AND THEORIES 
OF CONFIRMATION 


One of the strongest reasons for accepting the moderate non-differential view is 
the identification of confirmation with an increase in probability. Wesley 
Salmon [1975], following Carnap, defines a ‘relevance’ sense of ‘confirmation’ 
as follows (where p=probability; e=new evidence; and b=background 
evidence): 


(R) relevance sense: e confirms h if and only if p (h, e & b)>p (h, b). 


It is immaterial for our discussion whether we take (R) to be defining a 
second sense of confirmation (Salmon also discusses an ‘absolute’ sense) or 
stating necessary and sufficient conditions for a certain kind of confirmation. 
What is important is that (R) permits confirmation in the absence of 
differentialness. In our puzzle case, the discovery of E raises the probability of 
both T1 and T2 even though it does not differentiate between them; given (R), E
would confirm both theories. 
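
How (R) treats the puzzle case can be illustrated with a short calculation. The following is only a sketch under stated assumptions: five mutually exclusive and jointly exhaustive theories, each with prior probability .2, three of which are eliminated by E. The function names are illustrative and are not drawn from Salmon or Carnap.

```python
from fractions import Fraction


def condition_on_elimination(prior, eliminated):
    """Renormalize a prior over mutually exclusive, exhaustive theories
    after the evidence rules out the theories in `eliminated`."""
    surviving = {t: p for t, p in prior.items() if t not in eliminated}
    total = sum(surviving.values())
    posterior = {t: p / total for t, p in surviving.items()}
    posterior.update({t: Fraction(0) for t in eliminated})
    return posterior


def confirms_by_relevance(prior, posterior, theory):
    """(R): e confirms h if and only if p(h, e & b) > p(h, b)."""
    return posterior[theory] > prior[theory]


# Assumed puzzle-case numbers: five equiprobable theories, E eliminates three.
prior = {t: Fraction(1, 5) for t in ("T1", "T2", "T3", "T4", "T5")}
posterior = condition_on_elimination(prior, eliminated={"T3", "T4", "T5"})

r_confirms_t1 = confirms_by_relevance(prior, posterior, "T1")
r_confirms_t2 = confirms_by_relevance(prior, posterior, "T2")
# E differentiates only if it leaves one theory more probable than the other.
differentiates = posterior["T1"] != posterior["T2"]
```

On these assumed numbers (R) counts both T1 and T2 as confirmed, although E leaves them exactly tied, which is the feature of (R) at issue in what follows.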

So, if (R) is an acceptable theory of confirmation, if any increase in 
probability is sufficient, then differentialness is not necessary for confirmation. 
(R) reflects a view that is widely accepted and deeply entrenched; many 
philosophers would regard it as stating not a theory about confirmation but an 
obvious tautology. Just as many philosophers once used 'necessary truth' and 
'a priori truth' interchangeably, which made it difficult to challenge convinc- 
ingly the view that all necessary truths are a priori, many also equate 'increase 
in probability’ with ‘increase in confirmation.’ We intend to argue, however,
that the latter equation is incorrect, subject to the following qualification. 

One can use ‘confirm’ in a non-epistemic sense and interpret (R) as 
stipulating that any increase in probability is, by definition, an increase in
confirmation. (Analogously, ‘a priori truth’ can be used non-epistemically; we 




can then stipulate that any necessary truth is in this sense a priori.) However, if
we use ‘confirm’ non-epistemically, the concept can no longer play its 
customary role in the analysis of science. We want to be able to say that in 
confirming an hypothesis, we ipso facto obtain some reason to think it true, and 
in disconfirming it, at least some reason to think it false. If we use ‘confirm’ in a 
non-epistemic sense, we cancel this implication and, consequently, trivialize 
many disagreements about the interpretation of empirical data. In many 
disputes about whether evidence E confirms H, there is a disagreement about 
whether E provides some reason to think H true (or approximately true). That 
disagreement is ignored if we analyse the dispute as being about ‘confirmation’ 
in a non-epistemic sense. 

Assuming that 'confirmation' is used epistemically, (R) encounters a 
number of apparent counter-examples. Suppose that I conjecture that the next 
card to be turned face up from a shuffled deck will be the three of diamonds. If I 
remove the four of clubs—or any card known not to be the three of 
diamonds—then I have increased the probability of my conjecture being true. 
However, the removal of the four of clubs does not confirm that the three of 
diamonds will be the first card turned up. If the removal of the four of clubs 
were evidence for my hypothesis, then it would be equally good evidence that I 
will turn up the king of hearts, the ace of spades, etc. 

The defender of (R) is likely to reply that the removal of the four of clubs is 
evidence for my conjecture about the three of diamonds; it is just very weak 
evidence. Whether or not this reply is plausible, it is less convincing if the case
is modified as suggested by an example of Peter Achinstein [1983]. Suppose
that the deck contains one three of diamonds, 50 fours of clubs, and 50 aces of
spades. The removal of all of the fours of clubs is evidence that the next card to
be turned up will not be the three of diamonds but the ace of spades, despite 
some increase in probability that the three of diamonds will be picked. There 
are also other sorts of cases in which an increase in probability fails to yield 
confirmation. As Achinstein [1983] points out, for example, every time Mark 
Spitz (the Olympic swimmer) goes swimming, there is an increase in the
probability that he will drown, but his going swimming is not evidence that he 
will drown. Every day that I survive I increase my chances of living to 105, but 
I have no evidence at all that I will live to exactly that age. The chances of it 
snowing heavily in Miami next year will be greater in winter than in summer, 
but the onset of winter there will not provide any reason to expect heavy snow. 
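
The arithmetic behind the two card cases can be checked with a short sketch. It assumes a uniformly shuffled deck from which a single card is turned up; the helper names are illustrative only.

```python
from collections import Counter
from fractions import Fraction


def draw_probabilities(deck):
    """Probability that each card type is the next card turned up."""
    total = sum(deck.values())
    return {card: Fraction(count, total) for card, count in deck.items()}


def remove_all(deck, card):
    """Return a copy of the deck with every copy of `card` removed."""
    remaining = Counter(deck)
    del remaining[card]
    return remaining


# Ordinary 52-card deck: removing the four of clubs lifts every other card equally.
ranks = ["ace", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "jack", "queen", "king"]
suits = ["clubs", "diamonds", "hearts", "spades"]
standard = Counter(f"{rank} of {suit}" for rank in ranks for suit in suits)
std_before = draw_probabilities(standard)
std_after = draw_probabilities(remove_all(standard, "four of clubs"))

# Modified deck: one three of diamonds, 50 fours of clubs, 50 aces of spades.
modified = Counter({"three of diamonds": 1, "four of clubs": 50, "ace of spades": 50})
mod_before = draw_probabilities(modified)
mod_after = draw_probabilities(remove_all(modified, "four of clubs"))
```

In the ordinary deck the three of diamonds and the king of hearts both move from 1/52 to 1/51. In the modified deck the three of diamonds rises from 1/101 to 1/51 while the ace of spades rises to 50/51, so the probability of the conjecture increases even though the evidence points to the ace.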

There are counter-examples, then, to the view that any increase in 
probability is sufficient for confirmation. There is, however, one seemingly odd 
consequence that follows from a rejection of this view. In our puzzle case, the 
probability of T1 increased from .2 to .5 as the result of the disconfirmation of
T3, T4 and T5. Is it not odd, then, to say that despite these results, we still have
no reason to think that T1 is true? In our view, it is not odd. We may have more
reason than we had before the experiment to think that T1 is true, but it does




not follow that we have any reason to think that. Suppose that I am diagnosed 
as being terminally ill. The discovery that the diagnosis was erroneous, and 
that I am not ill at all, gives me more reason than I had immediately before the
discovery to believe that I will live to be 130; yet, I have no (good) epistemic 
reason to think that I will live to that age. Indeed, I now have more reason than 
I did before to think that I will live for 1,000 years; for my negative evidence 
has diminished. Yet, I still have no reason to believe that that will happen. 
Suppose that the state of Florida announces that it will institute a lottery next
year. I now have more reason than before to think that I will win the Florida 
State Lottery. All that this means, however, is that my chances of winning 
have improved slightly; it does not mean that I have any good reason to think 
that I will win. In general, saying that as a result of a small increase in 
probability there is more reason than before to think that H is true—that Mark 
Spitz will drown, that I will live to be 130, etc.—does not imply that there is any 
good epistemic reason to think that H is true. If I cross the street, there is more 
reason than if I stay indoors to think that I will get hit by a Jaguar, but crossing 
the street generally does not give me any reason to think that a Jaguar will hit 
me. 

We conclude, then, that (R) is unacceptable: not every increase in 
probability is sufficient for confirmation. If we are right, then one of the 
strongest reasons for rejecting the differential view of confirmation is based on
a false assumption. Perhaps, however, we can still reject a differential view if
we stipulate that the probability be greater than a specified amount. Consider,
for example, Salmon's [1975] conditions for 'absolute' confirmation: 


(S) e confirms h if and only if: p (h, e & b)>K. 


Salmon stipulates that K=some chosen number close to 1. On that 
stipulation, all of the previous counter-examples fail: in each case there is no 
confirmation, but the conditions of (S) are not met. However, if (S) sets 
necessary conditions for confirmation, and if K is close to 1, as Salmon 
requires, then all confirmation is differential. For example, if H1's competitor
H2 is at least as likely to be true, then the probability of H1 is not close to 1. In
fact, as long as 'K' denotes a probability higher than .50, the conditions set out
in (S) cannot be met by any hypothesis that is not also confirmed
differentially.²
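
The point about thresholds above .50 can be sketched as follows. This is an illustration under assumptions, not part of (S) itself: the rival hypotheses are taken to be mutually exclusive, 'at least as plausible' is read as 'at least as probable', and the posterior values are hypothetical.

```python
from fractions import Fraction


def confirmed_absolutely(posterior, threshold):
    """(S): the hypotheses whose probability on e & b exceeds K."""
    return [h for h, p in posterior.items() if p > threshold]


def has_equal_or_better_rival(posterior, hypothesis):
    """True if some rival is at least as probable as `hypothesis`."""
    return any(p >= posterior[hypothesis]
               for h, p in posterior.items() if h != hypothesis)


K = Fraction(1, 2)

# Hypothetical posteriors over mutually exclusive rivals (each sums to at most 1).
tied = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
leading = {"H1": Fraction(3, 5), "H2": Fraction(3, 10), "H3": Fraction(1, 10)}

passes_when_tied = confirmed_absolutely(tied, K)
passes_when_leading = confirmed_absolutely(leading, K)
```

Because the probabilities of incompatible hypotheses cannot sum to more than 1, any hypothesis that clears a threshold of .50 or higher leaves less than .50 for all of its rivals combined. Nothing passes in the tied case, and the one hypothesis that passes in the second case has no rival of equal or greater probability.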

Someone holding a non-differential view might accept (S), but stipulate 
that K equal some number between 0 and .50, that number being high
enough to avoid the previous counter-examples. The problem, here, is that any 
such number appears to be arbitrary and unacceptable. For example, suppose 
that we use a single subject design to test the hypothesis that monetary 

2 Achinstein [1983] also argues that the conditions of (S) are not sufficient for confirmation, but
this point is irrelevant to our concern as to whether differentialness is necessary for
confirmation.




reinforcement is generally effective in reducing the drinking of alcoholic 
beverages in chronic alcoholics (see Hersen and Barlow [1976], p. 194). Even 
if the experiment is well designed, the subject might have peculiar character- 
istics that limit the generalizability of our results. Suppose, however, that a 
series of replications is run with different subjects and with entirely favorable 
results, and that each succeeding trial raises the probability of our hypothesis. 
After which trial should we say that we now have some evidence for our 
hypothesis? Even if we could calculate the precise probability generated by 
each experiment, some rationale is required for saying: Now that the 
probability has been raised to .23 (or .38, or .43, or whatever), there is now
some evidence for our hypothesis. Without such a rationale, it seems arbitrary 
and unacceptable to choose any number between 0 and .50.

It might be replied that any numerical standard is likely to be arbitrary; so, if
our objection were sound it would apply to any probability account of
confirmation. We do not think, however, that this is so. The .50 standard does
have a rationale, whether or not it was in the minds of those who have 
defended such a probability account: the rationale is provided by the 
requirement of differentialness. If all confirmation is differential, then one way 
of guaranteeing that an hypothesis is not confirmed unless it is confirmed 
differentially is to insist that the evidence raise the probability of an hypothesis 
to some point above .50.

Someone might also argue against differentialness by appealing to a theory 
of confirmation that does not talk about numerical probabilities.³ It is likely,
however, that the problem faced by (R) and (S) will reappear: either the
standards will be obviously too low to be acceptable or they will require
differentialness. It would be tedious to demonstrate this for every view of 
confirmation discussed in the literature, but consider a couple of important 
examples. 

Consider, first, the traditional hypothetico-deductivist account of confirma- 
tion. On one version (which is non-differential), we confirm an hypothesis by 
deriving from it and other assumptions a prediction that is then discovered to 
be true by observation. As noted earlier, however, in discussing Hall's 
conditions, not every confirmed prediction will suffice for confirmation. A 
spiritual therapist who treats clients with a common cold can derive the 
prediction that most clients will improve after three weeks of intensive prayer, 
but the confirmation of this prediction is not evidence for the hypothesis that 
prayer cures colds. To exclude such cases, Ronald Giere [1977] requires that 
the prediction describe a surprising event. Let H = any hypothesis, IC = initial
conditions, AA = auxiliary assumptions, and P = any prediction. Giere, then,
sets the following conditions for confirmation: 


3 A non-differential epistemic view not tied to probability avoids objections to treating positive 
relevance as necessary and sufficient for confirmation, or high probability as sufficient (cf. 
Achinstein [1983]). A differential epistemic view, however, also avoids these objections.






(T) P confirms H if and only if: (1) If (H and IC and AA), then P; (2) P; and (3) If not (H 
and IC and AA), then very probably not P. 


Whether or not (T) is acceptable, it clearly requires differentialness. In our
puzzle case, for example, T1 would not be confirmed by E even if conditions (1)
and (2) were satisfied; for, (3) would not be met. Given the plausible rival T2, P
would not be very improbable even if T1 were false.
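
Condition (3) of (T) can be given a rough numerical reading. The sketch below rests on assumptions that are not Giere's: initial conditions and auxiliary assumptions are held fixed, the hypotheses are mutually exclusive and exhaustive, 'very probably' is read as a probability of at least .95, and all the priors are hypothetical.

```python
from fractions import Fraction


def prob_prediction_given_not(h, prior, predicts):
    """p(P | not-h), where predicts[x] is p(P | x) for each hypothesis x."""
    rivals = {x: p for x, p in prior.items() if x != h}
    mass = sum(rivals.values())
    return sum(p * predicts[x] for x, p in rivals.items()) / mass


def giere_condition_3(h, prior, predicts, very_probable=Fraction(95, 100)):
    """Condition (3): if not-h, then very probably not-P."""
    return 1 - prob_prediction_given_not(h, prior, predicts) >= very_probable


# T1 and T2 both entail the prediction P; the catch-all 'other' entails not-P.
predicts = {"T1": 1, "T2": 1, "other": 0}

# Hypothetical priors with and without a plausible rival that also predicts P.
with_rival = {"T1": Fraction(9, 20), "T2": Fraction(9, 20), "other": Fraction(1, 10)}
without_rival = {"T1": Fraction(9, 20), "other": Fraction(11, 20)}

p_given_not_t1 = prob_prediction_given_not("T1", with_rival, predicts)
```

With the rival T2 in play, P remains quite probable (9/11 on these numbers) even if T1 is false, so condition (3) fails; with no such rival it is met. The particular figures carry no weight, only the contrast between the two cases.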

Consider one more example: an inference to the best explanation approach 
to confirmation. Bas Van Fraassen [1980] discusses (and rejects) a simplified 
version of this approach. Suppose that we have some evidence E and are 
considering two hypotheses H1 and H2. We should then infer H1 rather than H2
exactly if H1 is a better explanation of E than is H2. Translating this rule into a
statement, we get the following: 


(U) If we are considering H1 and H2, then E confirms H1 if and only if H1 provides a
better explanation of E than does H2.


(U) is non-differential in that it permits confirmation of H1 even when there
is a known third hypothesis H3 that is not considered but is superior to H1.
However, (U) is clearly unacceptable. Suppose that I lose my job and consider 
only two possible explanations: that I once came late to work and that my boss 
does not like Estonians. The second hypothesis may provide a better 
explanation if I am an Estonian, but it may have no empirical support at all if I
know, but neglect to consider, that I have come to work intoxicated for the past 
week, have twice caused serious accidents during this period, and have been 
leaving work 2 hours early. More generally, E may not support the better of 
two explanations that are considered if neither is likely to be true given the 
antecedent likelihood of a third hypothesis that is not considered. 

Consider one more version of the best explanation view, one that avoids the 
above problem: 


(V) If there is warrant for believing that either H1 or H2 or Hn correctly explains E,
then if H1 provides the best explanation of E, then E confirms H1.


If restricted to causal hypotheses, (V) appears plausible, but it requires 
differentialness. If we apply (V) to our puzzle case, neither T1 nor T2 is
confirmed by E; for neither provides a better explanation of E than its rival. In 
general, in any case where differentialness is absent, (V) will fail to license the 
confirmation of any of two or more competing hypotheses.

There are other theories of confirmation, but none that we know of that is 
both plausible and yet implies non-differentialness. We conclude that support 
for a non-differential view, weak or strong, is not likely to be found by 
appealing to any current theory of confirmation. 




4 THE CASE FOR DIFFERENTIALNESS 


Is all confirmation, then, differential? In discussing such a fundamental, 
rock-bottom question, it is difficult to develop a non-question begging 
argument that clearly settles the issue. We think, however, that two things can 
be done in addition to weakening the case for non-differential views. The first is 
to remove apparent obstacles to acceptance of a differential view; the second is 
to cite some positive considerations that tip the balance in its favor. 

One apparently counter-intuitive result of assuming differentialness is that 
every confirmatory experiment must be a crucial experiment. By this we mean 
that every experiment that confirms H differentiates between H and its 
plausible rivals (if it has any),⁴ by affording reason for believing H which it does
not afford any of its rivals. This result may seem counter-intuitive, and thus 
may appear to support a non-differential view. For, we normally distinguish 
between a well designed and a crucial experiment. However, this apparent 
counter-intuitiveness can be dissolved by drawing an important distinction. 
Crucial experiments, as traditionally conceived, have two elements: (1) they 
give the edge to one hypothesis over its main rival(s); and (2) they settle the 
issue, once and for all, whether a given hypothesis or its rival(s) is (at least 
approximately) true. Crucial experiments in this traditional sense need to be 
distinguished from 'crucial experiments' containing only the first feature; the 
latter differentiate between H1 and H2 but do not, necessarily, settle the issue
between the two once and for all. Because one accepts a Quine-Duhem view of 
confirmation (or for other reasons), one might hold that experiments never 
conclusively settle the issue between two rival hypotheses, but also accept the
view that all successful, confirmatory experiments are crucial in the weaker 
sense: they provide differential confirmation. It is only in this weaker sense that 
the differential view implies that all successful experiments are crucial. This 
point regarding crucial experiments simply reemphasizes the idea that 
experiments, according to the epistemic-differential view, are designed to 
distinguish between rival hypotheses. It is, on this view, never just H alone that 
is tested in a (successful) experiment; rather, H is tested against at least one 
rival (even in the case in which the only formulated alternative is not-H).
Experiments are in this sense competitions; a successful experiment never tests 
one hypothesis alone. So, the differential view does not imply the implausible 
conclusion that every experiment that is confirmatory conclusively settles the 
issue of whether or not H is true. Evidence can be differential and yet be 
inconclusive or weak. 

A differential view also does not imply that for H to be confirmed, an 
experimental result must rule out all of H's competitors. The view does imply,
however, that competitors not ruled out by the experiment are rendered less 


4 In some cases, H's only plausible rival may be not-H. 




plausible than H by relevant background information. To illustrate, we may 
use a control group in one experiment that permits us to rule out H2 but not H3, 
and then do a second experiment with a different control group that allows us 
to reject H3 but not H2. In each experiment, the evidence rules out only one
competitor to H1, and yet the results of experiment 2 might confirm H1. The
reason is that given the results of experiment 1, H2 is no longer as plausible as
H1 (assuming that all three hypotheses were equally plausible prior to the two
experiments). Before we do experiment 2, then, our background evidence rules
out H2, leaving H1 and H3 as the only plausible competition. Differential
confirmation of one of them is then possible. 

Differentialness thus does not imply these counter-intuitive results. But can 
anything be said in positive support of the differential view? We think there are 
at least two reasons for adopting that view. The first positive reason for 
accepting a differential view is that it gives a better account than a non- 
differential view of why it is important to do controlled experiments. 

Why control variables? If, for example, in investigating the relation between 
gender and mathematical performance the social scientist controls for SES 
(socio-economic status), or in investigating the relation between volume and 
pressure of a gas the physicist controls for temperature, the variables in 
question are controlled in order to rule out plausible alternative hypotheses 
which would not be ruled out if the variables are not controlled. For instance, if 
the social scientist finds that gender and mathematical performance were 
interestingly related, say that males routinely performed significantly more 
poorly than females, but does not control for SES, then the hypothesis that 
gender affects mathematical performance is no more confirmed by the social 
scientist's finding than the alternative hypothesis that SES affects mathemati- 
cal performance. (Of course none of this rules out the possibility that both 
gender and SES affect mathematical performance.) In other words, to confirm 
a hypothesis concerning the relation between gender and mathematical 
performance, plausible alternative hypotheses which also fit the data (e.g. that 
gender has no effect on mathematical performance but that SES does) must be 
ruled out. This is the function of the practice of controlling variables. Similar 
remarks apply to the physics case: in order to confirm the hypothesis that there 
is a negative or inverse relation between the volume and the pressure of a
gas, the physicist must rule out the alternative hypothesis that it is not an 
increase in volume, but rather a change in temperature, which accounts for 
the decrease in pressure. 

The point is a simple one: the rationale for the experimental strategy of 
controlling variables is simply that that strategy affords the scientist the 
opportunity to rule out hypotheses which are plausible alternatives to the 
hypothesis under investigation, and thus to confirm differentially that 
hypothesis or its negation. This simple point, however, has important 
ramifications. For on the non-differential view of confirmation, this point 




cannot be so easily explained. That is, on the non-differential conception, the 
practice of controlling variables has no obvious rationale. For the findings of
the social scientist do offer probabilistic (but non-differential) ‘confirmation’ for 
the hypothesis that gender affects mathematical performance. They do so even 
if SES is not controlled for. But if SES is not controlled for, then the findings 
equally well offer probabilistic ‘confirmation’ to the hypothesis that SES affects 
mathematical performance. The probability of each hypothesis is increased by 
the findings, but neither is differentially confirmed, since the findings afford no 
reason for believing either hypothesis (or a third hypothesis, namely that both 
gender and SES affect mathematical performance). In the uncontrolled case, 
none of these hypotheses is differentially confirmed, though they all have their 
probabilities raised. (Similar remarks apply in the physics case.) The differen- 
tial view of confirmation, then, explains what the non-differential view does 
not: why experimental control is desirable. In the absence of a reason for 
rejecting what is a cornerstone of scientific inquiry, the differential view 
obtains an edge over the non-differential view. 

In assuming the desirability of experimental inquiry, we are not denying 
that causal hypotheses can sometimes be confirmed by uncontrolled case 
studies or by epidemiological studies. Our point, rather, is that in those cases 
where experimental control is necessary for confirmation, it is necessary 
precisely to meet the demands of differential confirmation. 

A second argument for differentialness concerns the epistemic nature of 
confirmation. Consider a case where experimental results are flatly inconsis- 
tent. For example, during the 1960's and 1970's dozens of experiments 
appeared to confirm the hypothesis that systematic desensitization is more 
effective than a credible placebo in treating certain psychological problems. 
However, design flaws were discovered in many of these experiments: most 
notably, the placebo treatments were not as credible as the treatment being 
investigated. In the middle 70's, the experiments were re-done with the design 
flaws eliminated, but the results were inconsistent. Whether or not the earlier 
evidence was probatively worthless is controversial (see Erwin [1978], pp. 5- 
11), but if it was, the new evidence neither confirmed nor disconfirmed the 
clinical hypothesis. More generally, if we run 50 equally good experiments, 
and find 25 positive and 25 negative results, the overall results support neither 
the hypothesis being tested nor its negation. The obvious reason is that there is 
a cancelling effect: the data provide just as much reason to believe not-H as H
and, consequently, no (good) reason to believe either. However, this same
cancelling effect occurs in a single experiment when the results are non-differential.
In our puzzle case, for example, E is consistent with both T1 and T2.
Suppose that E confirms both hypotheses. If they are logically incompatible
(when combined with suitable auxiliary assumptions), and E confirms T2, then
from T2 (and auxiliary assumptions), we can infer not-T1. So if E confirms T1, it
equally well confirms its negation. 




Notice: we are not saying that the warrant for T2 automatically transfers to
the denial of T1. We are not, that is, assuming the following principle: If E is
evidence for H, and H entails H', then E is evidence for H'. Once again, one may
not realize that H entails H'. We are assuming, however, that in the example
being discussed, we are aware that T2 plus suitable auxiliary assumptions
entails not-T1. Consequently, if we have evidence that T2 is true, we have some
reason to believe not-T1.

So, assuming that we are aware of the incompatibility, if E gives us some
reason to believe T1 is true, it gives us just as much reason to believe that T1 is
not true. There is a cancelling effect exactly analogous to that in the case
where a series of experiments provide flatly inconsistent results. In both sorts of 
cases, the evidence is not confirmatory to any degree because it provides just as 
much reason to believe the negation of an hypothesis as the hypothesis itself. 
Such a reason is not a good reason to think that the tested hypothesis is true or 
that it is false. 

Despite the foregoing arguments, there are cases that seem to support a non- 
differential view. Consider the following:5


1. Lottery case. Let E=John and Jim are each given, free of charge, 500 
tickets in a 1,000 ticket lottery, and the winning ticket will win 1 million 
pounds. Let H = John will win 1 million pounds in the lottery; let H' = Jim will
win 1 million pounds in the lottery. Someone might urge that E provides some 
reason to believe H, even though it provides an equally good reason to believe 
the conflicting H'. Such a person might also say that E provides some evidence 
for H, and also some evidence for the conflicting H'. 

2. Bombing case. Let E=the enemy is being bombed. Let background 
information B=in 55% of the cases bombing causes its victims to fight harder;
in 45% of the cases it causes them to give up. Let H = the enemy will give up; let
H' = the enemy will fight harder. Someone might say: given B, E is some reason
to believe H, and it is also some (slightly stronger) reason to believe H’. Such a 
person might also say that E provides some evidence for H, and also some 
evidence (slightly stronger) for the conflicting H'. 

Given that we are discussing only an epistemic concept of confirmation, the 
evidence E in Case 1 supports H only if it provides some reason to believe H. 
Does it? If one says ‘yes’ on the grounds that John’s chances of winning the 
lottery have increased as the result of receiving 500 tickets, we have already 
criticized such grounds. An increase in the chances of H being true does not 
guarantee that we have a good reason to believe H. Even if 999 of the tickets 
have been given to Jim, John’s chances of winning would have improved by his 
being given 1 ticket; but John would still have no reason to think that he would 
win, despite the increase in his chances of winning. An increase in probability,
as argued earlier, is not always sufficient for confirmation of even a weak sort.
Suppose that no appeal is made to an increase-in-probability view of
confirmation. Still, given that John now has 500 tickets, it may seem as if he
has some reason to think he will win. Different philosophers, however, are
likely to have different intuitions about such cases. Furthermore, we have
already argued that where evidence is non-differential, as in this case, it is non-
confirmatory. An appeal to intuition at this point in the discussion would be
question-begging.

5 We would like to thank an anonymous reviewer for providing both cases and some helpful
comments on other parts of the paper.

Our reply to Case 2 is the same, except for one obvious qualification. We 
have no difficulty in accepting the idea that E confirms H' to some degree, but
this is no violation of the differential requirement. Does E also confirm H? It will 
seem to some that it does, but consider an analogous case. Because no antidote 
is available, Smith will surely die from her recent poisonous snake bite. A 
serum, however, is rushed to the site, one which works in 45% of such cases, 
but kills the patient in the remaining 55%. One might think: at least now there
is some reason to think that Smith will survive; before her case was hopeless. 
Our reply is roughly the same as that given earlier. Even if the serum killed the 
patient in 99% of the cases, the patient's chances of living would improve if she 
were given the serum. That fact, however, does not support the hypothesis
that she will survive the snake bite. Instead, given our background information,
we would have good reason to believe that if she is given the serum it will
cause her death. The strength of our reason would diminish in both the
bombing and snake cases if the ‘unwanted’ result (the enemy fights harder
and the patient dies) occurs in only 55% of such cases. Nevertheless, the total
evidence would still support to some degree the hypothesis that the unwanted 
results would occur. Given our earlier argument about the self-cancelling 
nature of non-differential reasons, no support would accrue to either of the 
‘favored’ hypotheses (the enemy will be caused to give up; the patient will 
survive). 
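
The arithmetic behind the lottery and serum cases can be laid out in a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, probabilities are treated as simple frequencies, John is assumed to hold no tickets before the gift, and 'favoured over its rival' is a crude stand-in for the differential requirement, which the paper does not define numerically.

```python
def raises_probability(prior, posterior):
    """True when the evidence makes the hypothesis more probable than before."""
    return posterior > prior


def favoured_over_rival(p_hypothesis, p_rival):
    """Crude proxy (an assumption, not the authors' definition) for
    differential support: the hypothesis must end up more probable
    than the conflicting rival hypothesis."""
    return p_hypothesis > p_rival


# Each case: (prior for H, probability of H given E, probability of rival given E).
# Priors of 0 reflect "no tickets yet" (assumed) and "Smith will surely die".
CASES = {
    "lottery: 500 tickets each": (0.0, 0.5, 0.5),
    "lottery: 1 ticket against 999": (0.0, 0.001, 0.999),
    "serum: works in 45% of cases": (0.0, 0.45, 0.55),
    "serum: works in 1% of cases": (0.0, 0.01, 0.99),
}

if __name__ == "__main__":
    for name, (prior, p_h, p_rival) in CASES.items():
        print(name, raises_probability(prior, p_h), favoured_over_rival(p_h, p_rival))
```

In every row the probability of H rises while H is never favoured over its rival, which is the gap between probability-raising and confirmation that the argument above relies on.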

We have argued that all confirmation is differential. Why, however, does the
issue matter? So long as it is agreed that scientists seek, and should seek, strong 
confirmation, and that this requires differentialness, why does it matter 
whether or not weak confirmation is also differential? One fundamental reason 
it matters is this. It often happens in science (it is customary in some areas of
the social sciences) that causal hypotheses are accepted on the basis of what is
at best very weak evidence. Disputes often arise about the epistemic status of 
such hypotheses. Is there any evidence that EST or psychoanalysis produces 
therapeutic benefits, that large doses of vitamin C inhibit the growth of 
cancerous tumors, that moderate federal deficits cause inflation, or that 
coronary bypass surgery reduces the chances of a fatal heart attack? In 
assessing such disputes, it would be useful to have a relatively clear standard as 
to what counts as confirming evidence. The differential view is not a substitute 
for a full blown theory of confirmation, but it does supply a standard of some
utility for settling disputes about confirmation. If we reject differentialness, it
becomes much harder to tell when data do and do not constitute confirming
evidence for a hypothesis.6


Philosophy Department 
University of Miami 


6 We would like to thank our colleagues at the University of Miami, especially Leonard Carrier,
Alan Goldman, Risto Hilpinen, Howard Pospesel and Reed Richter for helpful criticisms of earlier 
versions of this paper. 


REFERENCES 


ACHINSTEIN, PETER [1983]: ‘The Concept of Evidence’, in Peter Achinstein (ed.), The 
Concept of Evidence. Oxford University Press. 

ERWIN, EDWARD [1978]: Behavior Therapy: Scientific, Philosophical and Moral Foundations.
Cambridge University Press.

FINE, ARTHUR and FORBES, MICKY [1986]: ‘Grünbaum on Freud: three grounds for
dissent’, Behavioral and Brain Sciences, 9, 237-8.

GIERE, RONALD [1977]: Understanding Scientific Reasoning. Holt, Rinehart and Winston.

GRÜNBAUM, ADOLF [1984]: The Foundations of Psychoanalysis: A Philosophical Critique.
University of California Press.

HALL, CALVIN [1963]: ‘Strangers in dreams: an experimental confirmation of the
oedipus complex’, Journal of Personality, 31, 336-45.

HERSEN, MICHEL and BARLOW, DAVID [1976]: Single Case Experimental Designs.
Pergamon Press.

KITCHER, PHILIP [1985]: Vaulting Ambition: Sociobiology and the Quest for Human Nature.
The M.I.T. Press.

KITCHER, PHILIP [1987]: Precis for Vaulting Ambition: Sociobiology and the Quest for
Human Nature. Behavioral and Brain Sciences, 10, 61-100.

ROSENBERG, ALEXANDER [1987]: ‘Is There Really “Juggling”, “Artifice”, and “Trickery”
in Genes, Mind, and Culture?’, Behavioral and Brain Sciences, 10, 80-1.

SALMON, WESLEY [1975]: ‘Confirmation and relevance’, in G. Maxwell and R. Anderson
(eds.), Minnesota Studies in the Philosophy of Science, vol. 6. University of Minnesota
Press.

VAN FRAASSEN, BAS [1980]: The Scientific Image. Oxford University Press.


Brit. J. Phil. Sci. 40 (1989), 121-125 Printed in Great Britain 


Two Problems of Induction* 


JOHN O'NEILL 


ABSTRACT 


In this paper I distinguish two problems of induction: a problem of the uniformity of 
nature and a problem of the variety of nature. I argue that the traditional problem
of induction that Popper poses—the problem of uniformity—is not that which is 
relevant to science. The problem relevant to science is that of the variety of nature. 


1
Consider the following schema in second-order logic: 


\[
\{\forall H((H \neq F \;\&\; H \neq G) \to (((Ha \;\&\; Fa) \to Ga) \;\&\; ((\neg Ha \;\&\; Fa) \to Ga)))\} \equiv \forall x(Fx \to Gx)\,{}^{1} \tag{1}
\]


This schema captures a logical interdependence of quantifiers in first and 
second order logic that is of particular relevance to the problem of induction. 
Indeed, this logical interdependence might even be taken (wrongly, as I shall 
show) to provide a solution to the problem of induction. The problem of
induction is a logical problem: no finite set of singular statements $Fa_1 \to Ga_1$,
$Fa_2 \to Ga_2$, ..., $Fa_n \to Ga_n$ entails a universal statement $\forall x(Fx \to Gx)$, where the
quantifier ranges over an infinite domain. Given an infinite domain of
particulars, no finite set of statements describing observations will allow us to
deduce a universal law. How then might the logical schema outlined be taken 
to provide the basis for a solution to the problem of induction? 

The suggestion would be the following: while there are an infinite number of 
particular events and objects in our universe, there are only a finite number
of properties of such events and objects;3 hence, since the domain of properties
of objects and events is finite, one can, from the finite set of singular statements
P,

\[
P = \left\{
\begin{array}{l}
(H_1 \neq F \;\&\; H_1 \neq G) \to (((H_1 a \;\&\; Fa) \to Ga) \;\&\; ((\neg H_1 a \;\&\; Fa) \to Ga)), \\
(H_2 \neq F \;\&\; H_2 \neq G) \to (((H_2 a \;\&\; Fa) \to Ga) \;\&\; ((\neg H_2 a \;\&\; Fa) \to Ga)), \\
\qquad \vdots \\
(H_n \neq F \;\&\; H_n \neq G) \to (((H_n a \;\&\; Fa) \to Ga) \;\&\; ((\neg H_n a \;\&\; Fa) \to Ga))
\end{array}
\right\}
\]

deduce the universal statement

\[
\forall H((H \neq F \;\&\; H \neq G) \to (((Ha \;\&\; Fa) \to Ga) \;\&\; ((\neg Ha \;\&\; Fa) \to Ga))). \tag{2}
\]

From (1) and (2), we can deduce

\[
\forall x(Fx \to Gx). \tag{3}
\]

Hence from a finite set of singular statements one can deduce a universal
statement.

* I would like to thank Bob Hale, Russell Keat and the Journal's referee for their comments on
earlier drafts of this paper.

1 The bi-conditional from left to right is a statement of the principle involved in argument by
universal introduction in natural deduction systems of logic: see Gentzen (1969) Collected
Papers, (Szabo, ed.) Amsterdam: North Holland, p. 148, and Tennant (1978) Natural Logic,
Edinburgh: University Press. Proof of the bi-conditional from right to left is straightforward.

2 See Popper (1959) The Logic of Scientific Discovery. London: Hutchinson, p. 27 for a classic
statement of the problem.

This argument is clearly invalid. It involves a logical sleight of hand. It is not 
possible to have an infinite domain of particulars given only a finite set of 
properties ranging over those particulars. The possibility is ruled out by the 
identity of indiscernibles (understood in its wide sense i.e. including reference 
to positional as well as non-positional properties).4 In general, a finite number
of properties, n, could, at most, range over a finite domain of $2^n$ distinct
individuals—there cannot be more individuals than there are distinct subsets 
of a set of n properties. Given an infinite set of particulars one must have an 
infinite set of properties that individuate those particulars. The proposed 
solution to the problem of induction must, therefore, fail. 
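
The counting bound in the preceding paragraph can be checked mechanically. The following is a minimal sketch in Python, not part of the original argument: it assumes that an individual is fully characterised by which of n properties it has, so that, by the identity of indiscernibles, two individuals with the same profile are one and the same. The names `property_profiles` and `max_distinguishable` are hypothetical.

```python
from itertools import product


def property_profiles(n):
    """Return every way of having or lacking each of n properties.

    Each profile is a tuple of n booleans; profile[i] says whether the
    individual has the (i+1)-th property. This encoding is an
    illustrative assumption, not something stated in the paper.
    """
    return list(product((False, True), repeat=n))


def max_distinguishable(n):
    """Upper bound on how many individuals n properties can tell apart."""
    return len(set(property_profiles(n)))


if __name__ == "__main__":
    for n in range(5):
        # The count equals 2**n, which is finite for every finite n.
        print(n, max_distinguishable(n))
```

Because the bound is finite for every finite n, an infinite domain of distinct particulars cannot be individuated by finitely many properties, which is the point the paragraph makes.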

However, further elaboration of the grounds for this failure points to two 
distinct forms of the problem of induction, a problem of the uniformity of 
nature, and a problem of the diversity of nature; and two related forms of 
fallibilism, one found in the Hume-Popper tradition, the other in a tradition
that stems from Engels. In the rest of this paper I examine these different
positions. I argue that the latter raises problems of more relevance to the actual
pursuit of science.

3 A more sophisticated version of this assumption is found in Keynes:

The objects in the field over which our generalisations extend do not have an infinite number of
independent qualities; that, in other words, their characteristics, however numerous, cohere
together in groups of invariable connection which are finite in number. (Keynes, 1921, A Treatise on
Probability, London: Macmillan, p. 256.)

Keynes’ position differs from the one I state here in ways which will become clear.

4 As it was originally introduced by Leibniz, the principle of the identity of indiscernibles included
reference only to non-positional properties. As such it is not clearly true at all, let alone a
necessary truth. However, in its wide sense, including reference to both positional and non-
positional properties, it is clearly a necessary truth: for two objects to be distinct, there must be
some property that differentiates them. For a discussion, see A. Quinton (1973) The Nature of
Things, London: Routledge and Kegan Paul, pp. 24-8.


2 


The flaw in the proposed solution to the problem of induction lay in the
incompatibility of an infinite set of particulars and a finite set of properties
ranging over those particulars. Thus, if our laws of nature range over an
infinite domain of objects and events, then there must be an infinite set of
properties ranging over those objects and events. The two distinct problems of
induction arise from different accounts of the basis of this infinity of properties
in nature.

The problem of induction that Popper develops from Hume locates the 
source of that infinity in the non-finitude of time and space. An object or event 
can have an infinite number of temporal and spatial locations. Thus there is no 
set of premises P from which one could deduce a statement of the form (2). The 
quantifier in (2) ranges over an infinite set of predicates describing the possible 
temporal and spatial co-ordinates of objects and events. This account of the 
infinity of properties in the natural world is plausible: after all, it is in terms of 
spatial and temporal locations that particular events and objects are 
standardly individuated. However, the problem of induction that this account 
of nature's infinity raises is one that is irrelevant to the practice of science. 

The problem that is raised by this account of the infinity of properties in
nature is that of the uniformity of nature. The laws of nature that apply in the
temporal and spatial slice of the universe we have observed might not apply
elsewhere. At some future moment in time or at some unobserved place in
space, the laws of nature might no longer hold. Nature might not be uniform.5

While this problem blocks any deductive argument from a finite set of 
premises P to a statement of the form (2), it is a problem that is irrelevant to 
actual scientific practice. Science does assume the uniformity of nature. It
assumes that mere changes in space and time are causally irrelevant. Keynes 
usefully states this assumption thus: 


A generalisation which is true of one instance must be true of another which 
differs from the former by reason of its position in time or space.6


I will refer to this later as Keynes’ principle. That science, as it is practiced, does
in fact methodologically presuppose Keynes' principle is apparent in the
following two ways.

5 In the following I use Keynes' formulation of the principle that changes in temporal and spatial
position are causally irrelevant. The principle is explicitly stated within modern physics in terms
of the homogeneity of space and time. For a discussion of the principle, see J. Lucas (1984)
Space, Time and Causality, Chapters V and VIII; also, cf. E. Nagel (1961) The Structure of Science,
London: Routledge and Kegan Paul, pp. 316-24.

6 Keynes, op. cit., p. 256.

First, Keynes' principle is assumed in denying that the mere repetition of an 
experiment is a rational strategy for falsifying laws: if the position of an object 
in time and space were causally relevant, then the repetition of our experiment 
would be a rational strategy in attempting to falsify a law—indeed one should 
indefinitely repeat an experiment. An assumption that changes in temporal 
and spatial location can make a difference is incompatible with 'the law of 
diminishing returns from repeated tests'.7 (Furthermore, when experiments
are repeated in science, they are not repeated in order to test the causal efficacy
of spatial and temporal location: one repeats to ensure the reliability of an
experimental procedure.) Second, and more seriously, if a universal generalization
is falsified at a particular place and moment in time, it would be regarded
as unacceptable to account for the falsification in terms of the peculiar causal
properties of that particular temporal and spatial location. To do so would be ad
hoc. Thus the rational pursuit of science according to Popper's own fallibilist
canons relies on the assumption that mere changes in spatial and temporal 
location are causally irrelevant. 

If science does assume Keynes’ principle, and if the infinity of nature's 
properties consists solely in the infinity of possible temporal and spatial
locations of particular objects and events, then, with a few adjustments, the 
justification of induction proposed earlier begins to look plausible once more. 
One can state Keynes’ principle in terms of a reformulation of schema (1) thus: 


let L be the set of all possible spatial and temporal locations of any object or 
event, then 


\[
\{\forall H((H \neq F \;\&\; H \neq G \;\&\; H \notin L) \to (((Ha \;\&\; Fa) \to Ga) \;\&\; ((\neg Ha \;\&\; Fa) \to Ga)))\} \equiv \forall x(Fx \to Gx) \tag{1*}
\]

Now, if one assumes that there is only a finite set of properties of nature other
than those of temporal and spatial location, $H_1, H_2, \ldots, H_n$, then one can, from
the finite set of statements P*,

\[
P^{*} = \left\{
\begin{array}{l}
(H_1 \neq F \;\&\; H_1 \neq G \;\&\; H_1 \notin L) \to (((H_1 a \;\&\; Fa) \to Ga) \;\&\; ((\neg H_1 a \;\&\; Fa) \to Ga)), \\
(H_2 \neq F \;\&\; H_2 \neq G \;\&\; H_2 \notin L) \to (((H_2 a \;\&\; Fa) \to Ga) \;\&\; ((\neg H_2 a \;\&\; Fa) \to Ga)), \\
\qquad \vdots \\
(H_n \neq F \;\&\; H_n \neq G \;\&\; H_n \notin L) \to (((H_n a \;\&\; Fa) \to Ga) \;\&\; ((\neg H_n a \;\&\; Fa) \to Ga))
\end{array}
\right\}
\]

deduce the universal statement,

\[
\forall H((H \neq F \;\&\; H \neq G \;\&\; H \notin L) \to (((Ha \;\&\; Fa) \to Ga) \;\&\; ((\neg Ha \;\&\; Fa) \to Ga))). \tag{2*}
\]

From (1*) and (2*) one can deduce

\[
\forall x(Fx \to Gx). \tag{3}
\]

Assuming the uniformity of nature, which science as an activity does,
induction is justified. From the finite set of singular statements P*, one can
deduce the universal statement (3).

7 Popper (1972) Conjectures and Refutations, London: Routledge and Kegan Paul, p. 240.

The second problem of induction involves the denial of an assumption of this 
reformulation of the proposed solution to the problem of induction. The infinity
of nature's properties is not exhausted by the infinity of possible temporal and 
spatial locations of objects and events. There is a potential infinity in the 
variety of properties in the natural world. There are two forms which the 
infinity of the variety of nature's properties might take: first, a quantitative 
infinity in the 'intensity' of some qualities in nature—for some new range of 
intensity a previously established generalization might break down, as 
Newton's laws do for speeds approaching the speed of light; second, a 
qualitative infinity in the variety of qualities in nature—this assumption is 
developed by Bohm.8 The problem of induction thus stated is then a problem of
the diversity of nature: our experimental arrangements cannot capture the
variety of properties in nature. The problem of the infinite variety of nature's
properties lies at the basis of Engels’ Popperian-sounding assertions of
fallibilism.9 In this context it has been usefully stated by Lakatos:


As the universe is infinitely varied, it is very likely that only statements of infinite 
length can be true.10


If there is a problem of induction relevant to science, it is the problem of the 
variety of nature rather than of the uniformity of nature. In particular, this
account of the problem of induction makes better sense of the role of 
experiment and the continuation of experiment in science. It is the attempt to 
capture the possible variety of nature’s properties that is the rationale for 
experimental control and the continuation of experiment. One controls 
experiments in order to limit the number of non-essential properties that might 
influence the outcome of an experiment. One replicates experiments to check 
for adequate control in previous experiment. And one develops new experiments
to test a scientific law in hitherto unobserved conditions (not in hitherto
unobserved times and places).11 The problem of induction relevant to science
is the problem of the variety of nature.


Philosophy Department, University College of North Wales 


8 D. Bohm (1957) Causality and Chance in Modern Physics, London: Routledge and Kegan Paul,
especially Chapter 5.

9 See Engels, Anti-Duhring, London: Lawrence, n.d., section II and Engels (1954) Dialectics of
Nature, Moscow: Progress. For a discussion see J. O'Neill (1986) ‘Scientific Socialism and
Democracy: A Response to Femia’, Inquiry, 29, pp. 346-7.

10 Lakatos (1978) ‘Necessity, Kneale and Popper’, Worrall and Currie (eds.) Lakatos, Philosophical
Papers 2, Cambridge: University Press, p. 123.

11 See Keynes, op. cit., p. 240.


Brit. J. Phil. Sci. 40 (1989), 127-134 Printed in Great Britain 


DISCUSSION 


Glymour on Deoccamization and the 
Epistemology of Geometry* 


ABSTRACT 


Three lines of argument are employed to show that Glymour's position on the 
epistemology of geometry is probably not as strong theoretically as the position of 
the underdeterminists whom he attempts to refute. The first argument centers on 
Glymour's implicit use of a realist position on intertheoretic reference, similar to 
that employed by Boyd and other realists. Citations are made to various portions of
Glymour's work, and the relationship between the imputed theory of reference and 
Glymour's position spelled out. The second line of argument refutes Glymour's 
contention that his criteria for choosing among theories are strongly based and not 
a priori, and a third line of argument rephrases Glymour's original position in a 
way which more clearly shows its implicit base and manifests as well the
vagueness from which the formulation of the position suffers. It is concluded that a 
thorough examination of Glymour's deoccamization yields conclusions similar to 
those of the underdeterminists. 


In his article 'The Epistemology of Geometry', Clark Glymour [1977] argues 
against the underdeterminists. Although Glymour's argument is well fleshed- 
out, it is essentially a realist argument and relies on realist notions of what is 
constitutive of useful criteria for the evaluation of a scientific theory for its 
impact. In this paper I shall argue that, although Glymour’s position seems 
initially appealing, it does not go through because it assumes a problematic 
realist position with regard to intertheoretic reference of terms similar to that 
proposed by Richard Boyd and other theorists, a position which Glymour fails
to state explicitly. In addition, I shall argue that there is no reason to assume 
that Glymour's strategy (the ‘bootstrap strategy’) relies on principles that are 
soundly backed. Finally, I shall conclude that when Glymour's position is more 
fully set out, it plays into the hands of the underdeterminists. 


* This paper is the result of work begun in a seminar on Philosophy of Science held at Rutgers 
University in the late 1970's. Work was continued at the University of California, Santa Barbara 
under the generosity of Grant DOE (OSERS) G0083/03651. I would like to thank colleagues from 
both institutions, and I owe a special debt of gratitude to an anonymous reviewer for the journal. 


128 Jane Duran 


Glymour begins by noting that, according to previous accounts of the views 
of those who believe that the geometrical features of the universe are 
underdetermined (including Poincare, Reichenbach and Sklar), 


What one cannot do, if this catalogue of options is complete, is to admit the 
notion of empirical equivalence, admit an account of the sameness of meaning 
which permits that different theories might save the phenomena, deny that there 
are available a priori principles about what is most likely true, and still insist that 
arguments such as Poincare and Reichenbach give do not establish the
underdetermination of geometry. Of the options, listed and unlisted, I think this
uncatalogued view is the one closest to the truth. (Glymour [1977], pp. 227-8)


Glymour believes that one theory (that is, one version of the epistemology of 
geometry problem) is 'better tested than the other by the body of evidence in 
question’, ([1977], p. 228) so, of course, Glymour wants to take a strong stand 
against the underdeterminists. The portion of the article immediately 
following this assertion deals with Glymour's views on such topics as 
synonymy, evidence and confirmation; because his views on confirmation 
seem to be related in a fundamental way to the rest of his argument, I will 
mention these with some specificity. 

Glymour will eventually claim that there is a ‘strategy’ which can aid us in 
determining how well theories are supported by the evidence, hence aid us in 
opposing underdetermination. In the sub-section of his paper entitled 
'Confirmation', Glymour writes: 


It is essential that the computations of theoretical quantities from empirical ones 
not be so constructed that an instance of the hypothesis to be tested would result 
whatever the values of the empirical quantities might be. [In other words, that it
not be vacuous confirmation of the hypothesis.] This does not prevent one from
using the very hypothesis to be tested to determine values of certain theoretical
quantities (one does that, legitimately, in curve-fitting), but it does prohibit using 
the hypothesis itself in certain ways. . . . The strategy is not circular at all, though it 
is a bit of a bootstrap operation. One claims that if certain of the principles of the theory 
are true, then certain empirical data in fact determine an instance of some theoretical 
relations, and moreover if the data had been otherwise a counter-instance of that 
relation would have been obtained. [Emphasis mine.] (Glymour, [1977], p. 233) 


The point here is simply that Glymour is affirming belief in certain of the 
more common principles of scientific hypothesis testing, and that the 
instantiation of some theoretical relation to which he refers is the result of a 
line of reasoning which has the logical form of modus ponens. Glymour also 
notes that '. . . the hypotheses used in testing a hypothesis may themselves be 
in error, so that the instances or counter-instances obtained with them are 
spurious. The only means we have to guard against such error is to test the auxiliary 
hypotheses used in computing theoretical quantities and to test the original
hypothesis using a different combination of auxiliary hypotheses to determine
the values of theoretical quantities’. [Emphasis mine.] (Glymour, [1977],
p. 235)

The keystone of Glymour's argument against the underdeterminists,
however, is as follows. Glymour believes that there are at least four ways in 
which we can '... compare how well theories are supported by evidence’. 
([1977], pp. 234-5.) Condensing heavily, they are the following: (1) one 
theory is to be preferred to a second if it does not contain hypotheses 
disconfirmed by a body of evidence, and the other does; (2) a theory containing 
fewer untested hypotheses is to be preferred to one containing more; (3) a 
theory which has been tested in ways involving a greater variety of hypotheses 
is to be preferred over another, other things being equal; and (4) a theory, the 
body of evidence for which may be explained in a ‘more uniform way than 
another’, is to be preferred over another theory. Glymour makes it clear that he 
does not think these principles are a priori. (Recall our quotation at an earlier
point that we can 'deny that there are available a priori principles about what
is most likely true...') He also asserts that these principles may aid us in
distinguishing among the various theories offered as evidence that the
epistemology of geometry is underdetermined. 

Such preferences are not founded, or rather need not be founded, on a priori
conceptions about how the world is or likely is; they are founded on the
preference for better tested theories, and the various modes of comparison are
only aspects of that preference. ... I think there is rigor enough, however, to
distinguish unambiguously among candidates that are offered in demonstration
of the underdetermination of geometry. ([1977], p. 236)


At this point Glymour is ready for his major argument. Glymour's most 
telling criticism of the type of view espoused by the underdeterminists actually 
comes in the form of an analogy. He argues that one might be faced with a 
situation in which a high school student wanted to establish a theoretical 
alternative to Newton's constructs. The student would note that any 
appearance of 'force' in one of Newton's equations could be replaced by the 
appearance of ‘two distinct quantities’, ‘gorce’ and ‘morce’. Since gorce and 
morce together would do the same work as force alone, the high school student 
would confront the teaching physicist with the question, ‘Why not gorce and 
morce, rather than force?’ Glymour finds all of this similar to some of the 
premises in the argument for the underdetermination of the epistemology of 
geometry. He writes: 


If to the gorce plus morce theory we add the hypothesis that force is equal to the 
sum of gorce plus morce, then the theory, with this addition, entails Newtonian 
theory and every test of Newtonian theory is a test of the identical fragment of the 
expanded gorce plus morce theory. But there are no tests of the hypothesis that force 
equals the sum of morce plus gorce, nor are there any tests of those hypotheses that 
contain ‘gorce plus morce’ in place of ‘force’. The bootstrap strategy, then, gives 
formally what I should say informally. No surprise there. 







My thesis is that theories advanced to demonstrate the underdetermination of 
geometry bear a relation to ordinary theories very much like the relation the 
gorce plus morce theory bears to ordinary Newtonian theory, and are inferior for
much the same reason. ([1977], p. 238) 


The remainder of Glymour's paper deals with the more technical aspects of 
arguments having to do with the metric, notions of substantival space, and 
alternative accounts of phenomena which might possibly be described by some 
extant version of the general theory of relativity. The notions of what may 
count for or against a scientific theory, developed by Glymour in the earlier 
portion of his paper—and already discussed here—are crucial in his attempts 
to distinguish among these alternative accounts. Glymour concludes his paper 
by noting: 

The sorts of theories Reichenbach and many others have suggested, whether
they be understood in classical or relativistic contexts, are just not as good as the
theories they are supposed to prove to be underdetermined.... whatever may
turn up, it is bound to be more palatable than is the radical underdetermination
of geometry. ([1977], p. 250)


Now it seems to be part and parcel of Glymour's argument in the 'gorce and
morce example' that there is a problem with regard to putative referent(s) as
applied to ‘gorce’ and ‘morce’, and that the problem, at least on one construal,
when resolved plays into Glymour's argument against underdetermination.
But, I shall argue, Glymour fails to make this problem clear, and his implicit
construal of the terms of the problem begs the very question at hand and
weakens his argument.

The way in which the high school student's hypothesis is phrased employs
the crucial notion that 'gorce plus morce’ '. . . acts exactly as Newtonian force
does' ([1977], p. 237), and that the substitution may be made in the appropriate
equations, salva
veritate. But the difficulty is that there is an ambiguity in the expression 'acts 
exactly as Newtonian force does'. 'Gorce plus morce' may be co-referential 
with 'force'; that is, one entity may be referred to by either expression. On the 
other hand, it is possible that the two sets of terms are not co-referential,
because it is at least possible that there are in fact two distinct entities or 
phenomena—'gorce’ referring to one, and 'morce' to the other—which 
happen to act in tandem in a way identical to 'force'. Now it is an oddity, I 
claim, of Glymour's very close analysis that he seems to slide over this 
difficulty. We have already noted that Glymour employs the expression 'acts 
exactly'. In the quote from Glymour on the catalogue of options available to 
the underdeterminists with which we began this paper, Glymour speaks of the
'notion of empirical equivalence', and '. . . admit[ting] an account of sameness
of meaning which permits that different theories might save the phenomena’. 
The difficulty is that co-referentiality is notoriously, in our post-Fregean era, 
not the same thing as sameness of meaning, and Glymour’s example of the 


Glymour on Deoccamization 131


high-school student's conundrum ignores this fact. I plan to use this lacuna in 
Glymour's argument in a more precise way shortly, but first I think it 
necessary to establish where Glymour stands on the theory of reference 
employed by contemporary realists. 

The new realism, of which Glymour may be taken to be an exponent, has in 
general sought to create a thread of continuity of referent for scientific terms 
despite what might ordinarily be taken to be the changing intension of the 
term involved.¹ Part of what is at stake here is the age-old and currently quite
popular battle between explanatory power and predictive utility or force; in 
order to avoid the offensive rubric 'instrumentalist', a number of theorists have 
found it safer to include as part of their overview of scientific theorizing theories 
of reference which allow for referential continuity. One such theorist is 
Richard Boyd, whose work is cited by Glymour: 


... there must operate some principle of ‘matching’ or continuity between the 
theoretical vocabularies of different accepted theories, and... the reliability of 
the principle actually used cannot be explained without adopting a realist model 
of the role of theoretical terms in scientific theories. What we will eventually
claim is that these principles operate to guarantee that the referents of the same
theoretical terms occurring in two different theories (in the same subject area)
should be the same, and that this is essential to their reliability. This claim
cannot, of course, be made from a metaphysically neutral position on the status
of theoretical terms in scientific theories.²


Now Glymour's argument seems to assume that the theories which are 
adduced as competitors to render the epistemology of geometry underdeter- 
mined must contain theoretical terms which retain the sameness of referent, 
and, if that assumption is made, then, of course, that very assumption can be 
used against those theories, since there is no way to test for anything other 
than one hypothetical entity when previous tests have been couched in terms 
of the referent's being attached to one hypothetical entity. But the alternative, 
of course, is to fail to buy the realist's position that continuity of reference must 
be preserved. 

Glymour has cited Boyd, and Boyd's transcendental argument maintains 
that 'evidence for the acceptability of theories [be seen as] . . . evidence that 
these [theoretical] entities do exist...’ (Boyd [undated], p. 51). To be more
explicit, the strategy of such realist positions has typically been to contend that 
we can learn something from the fact that science has been so ‘successful’, and 
that what we learn is that the methodological principles which are related to 


1 See Glymour [1977], p. 246. See also Glymour's ([1980], p. 105 fn) favorable citation of Boyd
(who holds a similar view): 'The arguments in the text seem to be of some force against an
instrumentalist account of scientific discourse . . .'.

2 See Boyd ([undated], p. 82). This lengthy manuscript is a very full and explicit account of Boyd's
views.




this success rest on a realistic account of theoretical terms. The upshot is that 
comprehending terms realistically provides us with certain sorts of evidence. 
On this argument, it would seem clear that any evidence for the acceptability of 
theories—any confirming evidence structured in the logical form of affirming 
the consequent—works only because, as Boyd has said and as Glymour seems 
to agree, it is evidence that these theoretical entities do exist.³

On the other hand, if continuity of referent is not preserved, then in an 
argument such as the one employed by Glymour the following occurs. The 
notion (as originally phrased by Glymour) that there are ‘... two distinct 
quantities, gorce and morce; the sum of gorce and morce acts exactly as 
Newtonian force does’ (Glymour [1980], p. 356) is now subject to a different 
interpretation; the two distinct ‘quantities’ may now be thought of as two 
different entities, and the add-on phrase 'acts exactly as Newtonian force does' 
tells us nothing which is helpful. For either 'gorce and morce' do not refer to 
the same entity thought of as two arithmetic components or two 'quantities', 
but rather two or more entities which happen to act in unison, in which case the 
assumption that allows values to be computed assumes too much since it 
assumes '...that an algebraic combination of quantities is a quantity’ 
(Glymour [1977], p. 237) (which in this case does not do the situation 
ontological justice), or 'gorce and morce’ refers to the same entity as 'force'— 
an unwarranted assumption, which, naturally does make 'force' theoretically 
preferable—and which is not specified by the barebones argument Glymour 
has given. 

In other words, the underdeterminists' point seems to be well-taken, or at 
least better taken than one had previously thought. Glymour's argument to 
the contrary seems to constitute an interesting case of vagueness, or perhaps 
an inverted version of what Whitehead called 'the fallacy of misplaced 
concreteness’. There might conceivably be a way to test for the interaction of 
two distinct entities (rather than merely two ‘quantities’); how one could go 
about this is, of course, unclear, but Glymour's statements that ‘... [h]is
theory is merely an extension of Newton's. If he admits that algebraic
combination of quantities is a quantity, then his theory is committed to the
existence of a quantity, the sum of gorce and morce, which has all of the
features of Newtonian force, and for which there is exactly the evidence there is
for Newtonian force...' beg the question, since they seem to assume co- 
referentiality of the terminology. One would not object to this so strenuously 
were it not the case that what Glymour was originally talking about was 
'sameness of meaning'. 

Finally, as I stated at an earlier point, it is important to Glymour that the 
principles which allow one to choose how well theories are supported by 
evidence are not a priori. This was one of the original set of conjuncts in the 


3 Still another explication of this sort of view is to be found in Suppe ([1977], pp. 650-2, 716-29).




argument against the underdetermination of geometry. It may be that these 
principles are not known to us a priori; indeed, a strong case can be made for 
suggesting that the principles are the result of induction. In other words, 
previous experience suggests to us that theories which have the characteristics 
Glymour mentions (‘. . . containing fewer untested hypotheses’, etc.) are more
likely to have predictive utility, explanatory power, etc.—in short, to be 
‘successful’ theories. But surely the case for induction is no stronger here than 
it is anywhere else—indeed, probably weaker, given what we know about the
history of science—and so the ground upon which these principles stand is 
shaky, a fact which Glymour does not acknowledge. 

Glymour’s strategy, then, probably requires reformulation. But a reformu- 
lated or resuscitated version of the strategy will contain some significant 
differences, and probably will not yield the result at which Glymour was 
originally aiming (‘What one cannot do, if this catalogue of options is
complete ...’) ([1977], pp. 227-8). It is not, apparently, the bootstrap
computations portion of the strategy (cited by us in the first few paragraphs of
this piece) which needs to be overhauled. Rather, the original catalogue of
options 


... to admit the notion of empirical equivalence, admit an account of sameness 
of meaning which permits that different theories might save the phenomena, 
deny that there are available a priori principles about what is most likely true, 
and still insist that the arguments such as Poincare and Reichenbach give do not 
establish the underdetermination of geometry. (Glymour [1977], pp. 227-8) 


needs to be rephrased, at least for the sake of accuracy. Admitting an account 
of sameness of meaning which permits that different theories might save the 
phenomena does not make explicit what Glymour would like to believe is the
case based on the 'gorce plus morce' example. It is not only that different 
theories might save the phenomena—it is that the referents of different 
theoretical terms which play the same role in competing and closely related 
theories are being assumed (apparently) to be the same and this assumption— 
while never explicitly stated—is being used to discard theories. As we have
argued, it is inaccurate and crude to sneak co-referentiality in under 'sameness
of meaning’; more importantly, however, if the assumption of co-referentiality 
is not made, Glymour’s argument (based on the high school student example) 
seems to be just plain wrong. Assuming co-referentiality is to assume too 
much, at least given Glymour's initial formulation of the problem. Thus 
Glymour's original phrasing does not adequately describe his project; if the 
original phrasing is retained, then what Glymour claims as the argument of 
the underdeterminists probably stands since it speaks only of 'empirical 
equivalence' and 'sameness of meaning', not co-referentiality. 

In this paper two major lines of argument have been used to analyze and 
counter Glymour's contentions about the epistemology of geometry. The first 




line noted the extent to which Glymour is indebted to a realist position with 
regard to the referents of theoretical terms, and noted that this position is
similar to one espoused by Boyd and others. The upshot of this indebtedness, 
on close analysis, is that positing identical referents for differing terms in 
competing theories is gratuitous and is, in any case, not identical to any one of 
the conjuncts in the position which Glymour originally attributed to the 
underdeterminists. A second line of argument claimed that it may be the case 
that the principles upon which Glymour seems to rely are not known a priori, 
but if they are the results of induction (as seems to be the case) they do not 
stand on a particularly solid base. The paper concludes that a careful analysis 
of Glymour's original position yields a conclusion different from that which he
had intended. 


JANE DURAN 
University of California 
at Santa Barbara 


REFERENCES 


BOYD, RICHARD [undated]: Realism and Scientific Epistemology, unpublished manuscript
in circulation.

GLYMOUR, CLARK [1977]: ‘The Epistemology of Geometry’, Noûs, 11, pp. 227-51.

GLYMOUR, CLARK [1980]: Theory and Evidence, Princeton University Press. 

SUPPE, FREDERICK [1977]: The Structure of Scientific Theories, University of Illinois Press. 


Brit. J. Phil. Sci. 40 (1989), 135-136 Printed in Great Britain 




DISCUSSION 


Conventionalism in Physics 


If light is the ‘first signal’ (Reichenbach [1928]), then the experiment proposed 
by Stolakis [1986] to reveal any anisotropy of light-propagation relative to the 
apparatus would fail to do so even if it were moving through an ‘ether’ (or
more generally, a unique vacuum whose properties determine the speed of 
light) in which light travels isotropically. The problem of measuring the one- 
way velocity of light was treated in detail by Ives [1948], who assumed the 
existence of an ether, and quoted an expression due to Larmor [1900] for the 
velocity of light in a moving, transparent medium. This expression was derived 
on the basis of the physical constraint that inter-atomic (‘electrodynamic’) 
interactions can propagate no faster than light, so that the Lorentz
transformations apply to material objects moving through the ether. Larmor seems to
have been implicitly as aware as were his contemporaries, Lorentz and 
Poincaré, of the sufficiency of this constraint as a basis for the relativity 
principle (cf. Macrossan [1986]), though Poincaré [1905] was alone in 
making the point explicitly. (It is interesting to note that Heaviside [1892] 
appears to have been the first to calculate the ‘Lorentz’ contraction factor for a 
moving group of charges in equilibrium, on the basis of transmission of the
coulomb forces at the speed of light, though he did not discount the possibility 
of motion faster than light.) 

Referring to Figure 1 of Stolakis [1986], if the speed of the apparatus in the
ether is $w$ along AB, then according to Larmor the relevant one-way speeds of
light relative to the observer at C are $c \pm w$ in the vacuum and
$\frac{c(1 - w^2/c^2)}{n(1 \mp w/nc)}$ in the medium of refractive index $n$.
By a clock fixed in the ether, the duration of light-travel from C to A and back
is then $t'_A = \frac{l'(n+1)}{c(1 - w^2/c^2)}$, where $l' = l(1 - w^2/c^2)^{1/2}$
is the Lorentz-contracted value of the length $l$, as measured by an observer
fixed in the ether. Also, the corresponding duration from C to B and back is
$t'_B = \frac{l'(n+1)}{c(1 - w^2/c^2)} = t'_A$, so that there is no time
difference in the arrival at C of the returning light pulses, in spite of the assumed
anisotropy of their propagation. For the hypothetical clock stationary in the
ether, this duration is a function of $w$, but since both the length of the
apparatus AB and the rate of the co-moving clock at C are also functions of $w$,
the duration recorded by the clock at C is
$t_A = t'_A(1 - w^2/c^2)^{1/2} = t_B = l(n+1)/c$, substituting for $l'$, and is
independent of $w$. Similar considerations also
invalidate the second experiment suggested by Stolakis.
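
The cancellation behind this null result can be set out term by term. The following is only a sketch, not a reproduction of Stolakis's or Larmor's working: since Figure 1 is not reproduced here, it assumes that each round trip consists of one leg of ether-frame length $l'$ in vacuum and one leg of the same length in the medium, traversed in opposite senses along AB. The shorthand $\beta = w/c$ is introduced for convenience and is not in the original text, and the sign pairing (vacuum speed $c \pm w$ going with $1 \mp w/nc$ in the medium for the same direction of travel) is likewise an assumption.

```latex
% Assumptions (not from the source): one leg in vacuum, the opposite leg in
% the medium, each of ether-frame length l'; beta = w/c.
\begin{align*}
t' &= \frac{l'}{c \pm w} + \frac{l'\,n\,(1 \pm w/nc)}{c\,(1-\beta^{2})} \\
   &= \frac{l'\,(1 \mp \beta)}{c\,(1-\beta^{2})}
    + \frac{l'\,(n \pm \beta)}{c\,(1-\beta^{2})} \\
   &= \frac{l'\,(n+1)}{c\,(1-\beta^{2})}, \\[4pt]
t  &= t'\,(1-\beta^{2})^{1/2}
    = \frac{l\,(1-\beta^{2})^{1/2}\,(n+1)}{c\,(1-\beta^{2})}\,(1-\beta^{2})^{1/2}
    = \frac{l\,(n+1)}{c}.
\end{align*}
```

On these assumptions the terms in $\pm\beta$ cancel whichever leg is traversed first, so both round trips take the same time in the ether frame, and the factor $(1-\beta^{2})$ drops out of the reading of the co-moving clock.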


136 W. T. Morris 


Thus, even if light-propagation is not isotropic, the principle of relativity 
operates in the proposed experiments and null results are to be expected. 
Prokhovnik [1985] analyses many physical systems from the point of view of 
anisotropic light-propagation: all are found to be relativistic. On the basis of a 
model in which light travels isotropically only in one inertial reference frame, 
the sole physical constraint providing the mechanism for relativity is that no 
‘information’ can overtake light: causality mediated with this limiting speed is 
sufficient for what might be called ‘physical’ relativity (Poincaré [1905], 
Zeeman [1964]). 

If this approach is adopted, the ‘conventional’ elements of relativity are 
reduced to the status of assumptions which may be made, or not, as 
‘convenient’, rather than being of fundamental or axiomatic significance. 


W. T. MORRIS 
Teddington, U.K. 


REFERENCES 


HEAVISIDE, O. [1892]: Electrical Papers, Volume 2, pp. 490-9 and 504-18. London:
Macmillan.

IVES, H. E. [1948]: ‘The Measurement of the Velocity of Light by Signals Sent in One
Direction’, J. Opt. Soc. America, 38, pp. 879-84.

LARMOR, J. [1900]: Aether and Matter, p. 178. Cambridge University Press.

MACROSSAN, M. N. [1986]: ‘A Note on Relativity Before Einstein’, Brit. J. Phil. Sci., 37,
pp. 232-4.

POINCARÉ, H. [1905]: ‘The Principles of Mathematical Physics’, The Monist, 15, pp.
1-24.

PROKHOVNIK, S. J. [1985]: Light in Einstein's Universe, Dordrecht: D. Reidel.

REICHENBACH, H. [1928]: The Philosophy of Space and Time, Dover (1957), p. 143.

STOLAKIS, G. [1986]: ‘Against Conventionalism in Physics’, Brit. J. Phil. Sci., 37, pp.
229-32.

ZEEMAN, E. C. [1964]: ‘Causality Implies the Lorentz Group’, J. Math. Phys., 5, pp.
490-3.


British Society for the Philosophy of Science 
International Conference: 22-24 September 1989 
at the University of Reading 


Evolving Knowledge 
in Natural Science and Artificial Intelligence 


Aim: to bring Philosophers of Science and Computer Scientists (particularly "logic
engineers" and "knowledge engineers") together for a weekend conference to initiate
an on-going discussion of overlapping concerns. 

Rationale of the conference: Computer technology is generating powerful new means 
for representing and manipulating knowledge. To develop these tools and use them 
effectively, computer scientists need a better understanding of what is to be represented 
and how human reasoning manipulates knowledge. The discipline, by tradition 
responsible for studying the (albeit non-machine) representations, as well as the 
reasoning conducted within several high grade branches of human knowledge, is the 
Philosophy of Science. 


Provisional Programme 
Friday Evening: 
Opening Address: 
Prof. Clark Glymour, Carnegie Mellon University, Pittsburgh 
Saturday Morning Session One: 
What Kind of Field is Artificial Intelligence? 
Prof. Alan Bundy, Edinburgh Prof. Aaron Sloman, Sussex 
Saturday Morning Session Two(a): 
The Mathematics of Uncertainty 
Prof. Jeff Paris, Manchester Dr. Tomáš Havránek, Prague
Saturday Morning Session Two(b) 
Connectionism as a Paradigm in Artificial Intelligence 
Prof. Derek Partridge, Exeter Dr. Martin Davies, Birkbeck College 
Saturday Afternoon: 
Papers by Research Students 
Saturday Evening Session(a): 

Bacon Bytes Back Genetic Algorithms and Scientific Method 
Dr. Peter Gibbins, Open University Dr. Roger Young, Dundee 
Saturday Evening Session(b): 

Naive Physics and Folk Psychology 


Dr. Barry Smith, Manchester Prof. Adam Morton, Bristol 
Sunday Morning: 

Modelling Conceptual Change Implementing Abductive Inference 

Dr. P. Thagard, Princeton Prof. E. Charniak, Brown University 


Correspondence concerning conference c/o: 

Dr. J. E. Tiles, Department of Philosophy, Whiteknights Box 218, Reading RG6 2AA. 

Follow up: the British Society for the Philosophy of Science has established a Special Interest Group for 
Philosophy and Computing Science (sponsored by Michael Jackson Systems Ltd.) which will continue to hold 
one-day conferences and workshops and will publish a bulletin. (Contact Dr. P. F. Gibbins, Faculty of 
Mathematics, Open University.) 







Brit. J. Phil. Sci. 40 (1989), 137-144 Printed in Great Britain 


Two Brains, Two Minds? Wigan’s Theory 
of Mental Duality 


ROLAND PUCCETTI 


1 Introduction

2 Wigan’s Argument for our Having Two Brains

3 Evidence for Double Mindedness in Split-Brain Patients

4 Wigan's Mis-statement of his Theory

5 Restatement of Wigan’s Theory

6 Current Misconceptions of Double Mindedness

7 The Evidence for Disguised Transcallosal Inhibition

8 Conclusion


1 INTRODUCTION


The occasion for writing this essay is the republication of A. L. Wigan's long- 
neglected classic study, The Duality of the Mind [1844, 1985], edited by Joseph 
E. Bogen and Joseph Simon. As Dr. Bogen makes clear in his Foreword to the
volume, award of a Nobel Prize in physiology in 1981 to Roger Sperry for his 
work with split-brain patients has brought widespread attention to the duality 
of the brain; whether this also signals the presence of a dual mind in normal 
humans, as Wigan argued, is of course highly controversial. Bogen nevertheless
states at the end of his Foreword that Wigan's 'prophetic vision was over
100 years ahead of the evidence which has ultimately sustained him (p. xv)'. 
In this essay I shall be concerned whether, or to what degree, Wigan's theory 
has indeed been confirmed by split-brain and related studies. But it will be 
equally important to this task to examine the conceptual framework Wigan 
was working in. 

What exactly was Wigan's theory? Perhaps his most succinct statement of 
this is found in Chapter XIX of the present edition:


There are some corrolaries which only need to be named, and their truth is so 
easily comprehended as to produce instant assent. If, for example, as I have so
often stated, and now again repeat, one brain be a perfect instrument of 
thought—if it be capable of all the emotions, sentiments, and faculties, which we 
call in the aggregate, mind—then it necessarily follows that man must have two 
minds with two brains; and however intimate and perfect their unison in their 




natural state, they must occasionally be discrepant when influenced by disease, 
either direct, sympathetic, or reflex (201-2). 


Two objections immediately arise from this synopsis. First, it does not follow 
necessarily that if we have two brains we have two minds, anymore than it 
follows necessarily from having two nostrils that we have two senses of smell.¹
Second, if it were the case that we have two minds, we should know this from
experiencing two distinct trains of thought, etc., occurring to us simultaneously.
Wigan apparently thought he did sometimes experience this, but
most of us make no such claim. So in order for Wigan's theory (at least as 
stated above) to stand a chance of being true, he would have to explain how it 
is that we do not experience any such mental duality. I believe there is a way to 
explain this, but only by major revision of Wigan's working concepts. 


2 WIGAN'S ARGUMENT FOR OUR HAVING TWO BRAINS 


But I am getting ahead of myself. Let us first examine the reasoning that led 
Wigan to conclude that we normals have not one but two brains.² I think his
argument to this effect in Chapter IV striking, even if he was obviously wrong
about the brain in other respects (he thought each cerebral hemisphere has
only three lobes, rather than four; and that disease could be transmitted from 
one hemisphere to the other only through the meningeal coverings, rather 
than the corpus callosum): 


I believe it to be entirely unphilosophical, and tending to important errors, to 
speak of the cerebrum as one organ. The term two hemispheres of the brain is, 
indeed, strictly a misnomer, since the two together form very little more than one 
half of a sphere . .. The two hemispheres are really and in fact two distinct and 
entire organs, and each respectively as complete (indeed more complete), and as 
fully perfect in all its parts, for the purposes it is intended to perform, as are the 
two eyes. The corpus callosum, and the other commissures between them, can 
with no more justice be said to constitute the two hemispheres into one organ, 
than the optic commissure [optic chiasm] can be called a union of the two eyes 
into one organ; and it would be just as reasonable to talk of the two lobes or 
globes of the eye, as of the two hemispheres of the brain (19). 


1 This is not to deny that we have two organs for smelling (the two sides of the olfactory bulb, each 
supplying the ipsilateral cerebral hemisphere with neural input that transduces into olfactory 
sensations): on this point Wigan was absolutely correct. 

2 Wigan's psychological conversion to belief in double mindedness was occasioned, he says, by a
discovery he made at autopsy early in his career: 


One hemisphere was entirely gone—that was evident to my senses; the patient, a man 
about fifty years of age, had conversed rationally and even written verses, within a few 
days of his death; yet I knew that, according to books, the mind could only manifest itself 
through a complete brain (which is true enough as I now explain it [he means each
hemisphere is a complete brain]), and I was in a similar state to that of persons who cannot 
refuse assent to geological facts, yet cannot reconcile them to the writings of Moses, in 
which they have absolute faith (32). 




Surely Wigan is on sound ground here. We feel no impulsion to conceive of a 
single organ, ‘the eye’, having two vitreous projections beyond the skin surface 
of the face; why then should we talk of ‘the brain’ as a single organ having two 
fibrous intracranial bulges? At the very least we can concede to Wigan that 
commissural connections between the cerebral hemispheres do not of 
themselves make these a single organ, and thus that it is equally correct to 
speak of ‘the brain’ as two cerebra, two half brains, or even two brains. 


3 EVIDENCE FOR DOUBLE MINDEDNESS IN SPLIT-BRAIN PATIENTS 


But again two brains do not necessarily amount to two minds, and this is where 
the work of Sperry and others on split-brain patients becomes crucially 
relevant to Wigan’s contention. For as a consequence of the therapeutic (for 
relief of grand mal seizures) surgery, or even of natural lesions to the corpus 
callosum uniting the two hemispheres, the patient displays thereafter a well- 
defined disconnection syndrome. This syndrome is not revealed in everyday 
behaviour (indeed such patients can pass a routine neurological examination 
if the physician is not looking for it), but under strictly controlled laboratory 
testing conditions it is unmistakably present. For example, a right-handed 
patient easily names objects palpated out of sight in the right hand (projecting 
most of its sensory fibres to the left, speaking hemisphere), but he or she cannot 
name objects palpated out of sight in the left hand (since almost all the sensory 
fibres of that hand project to the right, mute hemisphere, and there is no 
commissural transfer of the information to the speech hemisphere). Yet the 
‘same’ patient knows what is being palpated in the left hand, for upon 
command he or she can retrieve it from an array of objects behind a screen 
with that left hand (though not with the right hand, in the absence of 
interhemispheric commissural transfer). Since normal individuals like our- 
selves never display such bifurcated behaviour, there is no doubt we are 
dealing here with two independent streams of consciousness, or minds. 

However, the best demonstration of a disconnection syndrome is accom- 
plished in the visual modality, using a tachistoscope. This device back-projects 
on a screen in front of the patient a picture or a word for one-tenth of a second 
or less (long enough to register consciously in either hemisphere, but too 
quickly for normal scanning movements of the eyes to get the information 
from both sides of the screen into each hemisphere). If for example the patient 
is asked to fixate on a spot in the centre of the screen and the word TAXABLE is 
flashed so that the letters TAX fall to the left, and the letters ABLE to the right, 
of fixation, the right-handed patient with a disconnection syndrome will say he 
or she saw the word ABLE. Yet if asked to point to the word seen using the left 
hand, he or she will select the word TAX from a list of words that includes both 
ABLE and TAXABLE. The normal subject like you or I would, of course, say the 
word was TAXABLE and would point to TAXABLE with either hand. 




What explains this difference in behaviour? Given the concavity of the 
eyeballs, light from the right half of the visual field falls on the left (temporal) hemiretina
of the left eye, and on the left (nasal) hemiretina of the right eye. Neural 
impulses from these hemiretinae then join at the optic chiasm to project back 
to Brodmann’s area 17 (primary visual cortex) in the occipital lobe of the left, 
speaking hemisphere. This is why the patient with a disconnection syndrome 
verbally reports seeing the word ABLE. 

Similarly, light from the left half of the visual field strikes the right (nasal)
hemiretina of the left eye, and the right (temporal) hemiretina of the right eye. 
Impulses from these hemiretinae then join at the optic chiasm to project back 
to the homologous area 17 in the occipital lobe of the right, mute hemisphere. 
This is why the patient with a disconnection syndrome, using the left hand 
under control of the nonspeaking right hemisphere, points to the word TAX 
and not to ABLE or to TAXABLE. 

In a normal subject like you or I, however, there is a transfer of what was 
seen in area 17 of the right hemisphere via the corpus callosum (actually using 
fibres in its posterior 2/3, called the splenium) to the adjoining area 18 
(prestriate cortex) of the left hemisphere. In this way do the letters TAX join the 
letters ABLE to register there as the whole original word, TAXABLE. But if this 
is what goes on in the left hemisphere of the normal subject, what is going on in 
his or her right hemisphere? As we have already seen, in the split condition this 
right hemisphere is capable of pointing with the left hand to the letters TAX as 
what it has perceived on the left side of the screen. If so, there is no good reason
to doubt that in the unsplit condition area 17 of the left hemisphere projects the 
letters ABLE across the splenium to area 18, prestriate cortex, of the mute right 
hemisphere, thus forming the original complete word TAXABLE in that 
hemisphere as well. This, then, provides the model of mental duality Wigan 
was looking for a century and a half before. 


4 WIGAN’S MIS-STATEMENT OF HIS THEORY 


Yet Wigan did not find it. Why not? One reason may be that he was confused 
about what, exactly, he was claiming. What he believed he was claiming is set 
out with great clarity in Chapter IV, in the form of four propositions: 


1. That each cerebrum is a distinct and perfect whole, as an organ of 
thought. 

2. That a separate and distinct process of thinking or ratiocination may be 
carried on in each cerebrum simultaneously. 

3. That each cerebrum is capable of a distinct and separate volition, and 
that these are very often opposing volitions. 

4. That, in the healthy brain, one of the cerebra is almost always superior in
power to the other, and capable of exercising control over the volitions of
its fellow, and of preventing them from passing into acts, or from being
manifested to others (20).


Yet in Chapter XIX we find Wigan saying this:


I think it may be assumed without risk of contradiction, that the fact of each 
brain being a perfect and complete instrument of thought is abundantly proved. 
That each, while in health, corresponds entirely with its fellow, is obvious from 
the fact that this unison and correspondence give only one result, as in the case of 
the two eyes producing single vision (204). 


Even on the face of it, this statement about the two eyes giving only one 
result, i.e. one visual percept, contradicts Wigan's own proposition 2 above, 
since if visual perception is a mental activity then on his theory it should occur 
‘separately’ and ‘distinctly’ in each cerebrum simultaneously. And if it did, that 
is if our consciousness spanned both cerebra, then we should perceive any 
object of perception twice side-by-side: i.e. we should have double vision. It is not 
having two eyes that creates a problem for Wigan's theory (even with one eye 
closed, the residual eye has two hemiretinae projecting fibres to area 17 of both 
left and right hemispheres); nor does it matter that there should be perfect 
‘unison and correspondence’ between the two percepts: two is not one and that
is what mental duality requires. The real problem is that Wigan couched his 
theory in terms of one person having two minds, which if understood literally 
entails our having, e.g., double vision, though in fact we experience nothing of 
the kind. 


5 RESTATEMENT OF WIGAN'S THEORY 


Can the theory be saved nonetheless? I believe it can. Suppose we make a 
distinction Wigan did not make, between the human being qua individual 
organism, and the human being qua person, i.e. complex minded entity. We are
then in a position to restate the theory of mental duality as follows: the 
individual human organism, having two brains, is the biological substrate of 
two persons, each of which has one mind. In that case, there will be, e.g., double 
vision at the level of the organism, but each of the two persons will experience 
only single vision because, while each cerebrum receives input from the 
contralateral hemisphere about ipsilateral body space, neither cerebrum has 
introspective access to the conscious contents of the other. 

Why not? An answer may be divined from evolutionary considerations. All
vertebrate species have evolved two neural ganglia, one to each side of the 
anterior portion of a single neuraxis. Given this pattern of development, it 
comes as no surprise that all such species also evolved commissural 
connections between the two ganglia, since otherwise (given decussation of 
sensory and motor nerve tracts, with the exception of olfactory fibres) each 
ganglion would be ignorant of what is going on in ipsilateral body space. But it
would be equally important for such species that consciousness not span the
two cerebra, i.e. that there not be unity of consciousness within the whole
cranium, to avoid subjective double mindedness: imagine the effect, for
example, on our arboreal primate ancestors of seeing two branches out there
side-by-side, when in reality there is only one! This means that in more highly
evolved vertebrate species like our own, the function of the corpus callosum is 
not to integrate, but rather to duplicate the contents of conscious experience 
for the mutual benefit of the two cerebra. 


6 CURRENT MISCONCEPTIONS OF DOUBLE MINDEDNESS 


It is failure to make this distinction between dual mentality at the level of the 
organism, on the one hand, and subjectively experienced double mindedness 
on the other, which motivates much of the criticism directed at the theory. In 
her recent book, for example, Patricia Churchland describes the theory as a
claim that ‘everyone has two minds (180)’. Now what is meant by ‘everyone’ 
here? If she means every normal human organism, that is correct. If, however, 
she means every complexly minded entity, or person, that is not what the 
present restated version of Wigan’s theory maintains. She then comments that 
the theory entails ‘believing that each skull houses two persons/minds and 
that one is not, so to speak, oneself (Ibid.)’. But why should not each person of 
the pair, with but one mind, be ‘oneself’ to him/herself? Each would have a 
single stream of consciousness, even though its conscious content concerning 
ipsilateral body space is being relayed to it from the contralateral cerebral 
companion. The fact is that nothing one experiences in such a state would be 
any different from what one experiences at present; it is only when confronted 
with a disconnection syndrome that the original duality is revealed. 


7 THE EVIDENCE FOR DISGUISED TRANSCALLOSAL INHIBITION 


If this were all that could be said for the theory, it would appear to be largely a 
verbal issue about how best to describe the organization of consciousness in 
the human brain. But in fact there is more to recommend it: namely what is
hinted at in Wigan’s proposition 3 (that each cerebrum is capable of distinct 
and often opposing volitions), and in his proposition 4 (that in the healthy 
brain dominance of one cerebrum over the other prevents this conflict being 
manifested in behaviour). For hemispheric disconnection, whether by thera- 
peutic surgery or natural lesion, not only blocks information transfer between 
the two cerebra; it also disrupts transcallosal inhibition, thereby freeing the 
nondominant (usually the right) hemisphere to undertake independent
actions involving the contralateral (usually the left) body side, actions that
surprise and perplex the dominant or speaking hemisphere.

3 Although it may well be the case that duality is known to the mute right hemisphere of every
normal human being, simply because it knows it is not generating the linguistic behaviour it
observes emanating from its own body.

Following are some examples of resulting intermanual conflicts and the 
'alien hand' phenomenon, drawn from Bogen's [1985] authoritative review of 
the disconnection syndrome: 


The most interesting finding in the entire examination [of a patient presenting 
with a disconnection syndrome] is the frequent occurrence of well-coordinated 
movements of the left arm which are at cross purposes with whatever else is 
going on. These sometimes seem to occur spontaneously, but on other occasions 
are clearly in conflict with the behavior of the right arm. For example, when 
attempting a Jendrassic reinforcement, the patient reached with his right hand to 
hold his left, but the left hand actually pushed his right hand away. While testing 
finger-to-nose test (with the patient sitting), his left hand suddenly started 
slapping his chest like Tarzan (312-13). 


While doing the block test unimanually with his right hand [another patient's] 
left hand came up from beneath the table and was reaching for the blocks when 
he slapped it with his right hand and said, "That will keep it quiet for a while 
(313)". 


The patient may say, when the left hand makes some choice among objects, “My 
hand did that", rather than taking the responsibility. A patient was described [by 
an earlier investigator] as saying, "Now you want me to put my left index finger 
on my nose”. She then put that finger into her mouth and said, "That's funny; 
why won't it go up to my nose (313-14)?" 


Bogen's own summary of the disconnection syndrome, after decades of 
clinical experience, is the following: 


Split-brain patients soon accept the idea that they have capacities of which they 
are not conscious, such as retrieval [with the left hand] of objects not nameable. 
They may quickly rationalize such acts, sometimes in a transparently erroneous 
way. But even many years after operation, the patients will occasionally be quite 
surprised when some well-coordinated or obviously well-informed act has just
been carried out by the left hand (314). 


Of course when Bogen speaks here of ‘the patient’ having capacities of which 
he is not conscious, he means the person based in the speaking hemisphere; we 
have no good reason to doubt that the person based in the mute, right 
hemisphere and controlling the left hand retrieves objects consciously. 
Similarly, it is the dominant left hemisphere-based person who is surprised
when the left hand carries out a well-informed or well-coordinated act; the 
right or nondominant hemisphere-based person is not surprised, since he or 
she initiated that act. 




8 CONCLUSION 


Those sceptics about mental duality who give the safe rejoinder that it is 
ablation of the forebrain commissures which creates double mindedness in 
patients manifesting the disconnection syndrome, so that its symptoms tell us 
nothing about the organization of consciousness in normals, must ask 
themselves how it is possible one half of a formerly healthy (except for 
proneness to epileptic seizures in many cases) brain—the half that does not 
even think in language—should be able to perform transparently intentional 
and purposeful acts like repeatedly throwing a newspaper to the floor during 
television commercials (the right hemisphere cannot read, and the newspaper 
does block its view of the TV screen); or slapping one’s face when it is time to get 
up and have breakfast.* Such acts require, at the very least, memory, foresight, 
and anticipation of probable responses from the dominant, speaking
hemisphere. Did these characteristically human psychological capacities arise de
novo from the disconnection, or were they not previously present but
suppressed by the unconscious inhibitory influence of the speech hemisphere,
acting through the intact corpus callosum? The latter seems a far more
parsimonious explanation of the behaviour Bogen has described. 

I conclude that just as one mind does not become two upon hemispheric 
disconnection, so two minds do not become one upon maturation and 
myelinization of the commissural fibres linking together the two brains in our 
heads. Wigan may not have proved his theory, but he was on the right track.


Department of Philosophy 
Dalhousie University 


* The latter two examples were originally supplied to me by the late Stuart Dimond (personal 
communication, [1978]). 


REFERENCES 


BOGEN, J. E. [1985]: ‘The Callosal Syndromes’, Clinical Neuropsychology, Second Edition,
Oxford University Press, pp. 295-338. 

CHURCHLAND, P. S. [1986]: Neurophilosophy. MIT Press/Bradford Books. 

WIGAN, A. L. [1844, 1985]: The Duality of the Mind. Joseph Simon. 




Brit. J. Phil. Sci. 40 (1989), 145-154 Printed in Great Britain 


Structural Analogies Between 
Physical Systems 


PETER KROES 


ABSTRACT 


Structural analogies between physical laws have received considerable attention from 
philosophers of science. This paper, however, focusses on structural analogies between 
physical systems; this type of analogy plays an important role in the physical and 
technological sciences. A formal, set-theoretic description of structural analogies 
between physical systems is presented, and it is shown that a structural analogy 
between systems does not require a structural analogy with regard to the laws 
involved, nor conversely. 


1 Introduction 

2 Structural Analogies Between Physical Systems 
3 Set-theoretic Description of Structural Analogies 
4 Discussion 


1 INTRODUCTION


Analogies play a prominent role in science (see e.g. Shive & Weber [1982]). 
The use of analogies is widespread, not only in science education, but also in 
actual scientific practice, for instance in the design of experiments, the 
interpretation of experimental results and in theoretical explanations. The 
analogies employed in various contexts may be quite different in nature, 
ranging from (vague) metaphors to analogies based upon precisely defined 
mathematical similarities in the laws governing different kinds of systems. 
Therefore, it is no wonder that analogies perform different kinds of functions 
within science. The metaphoric use of analogies appears to be very significant 
within the context of concept formation in science; Rothbart [1984] has 
argued that ‘metaphoric concept formation is an essential aspect of scientific 
reasoning for the purpose of solving conceptual problems’ (p. 595). Compared 
with this, the use of analogies based on similarities in the mathematical form of 
laws is completely different. This kind of analogy is exploited in, e.g., simulation 
techniques, modelling and analogue computers. 

In spite of the pervasive use of analogies in scientific practice, there has been
among philosophers of science, especially among (logical) positivists, a strong
tendency to play down the significance of analogies in their analyses of science.
According to Hempel, analogies play no essential role in, for instance, scientific 
explanations; in his classic study Aspects of scientific explanation [1965] he 
writes: ‘For the systematic purposes of scientific explanation, reliance on 
analogies is thus inessential and can always be dispensed with’ (p. 439). 
According to him analogies are not relevant for a logical or epistemic analysis 
of science; they are an aspect of the pragmatics of science. This position is of 
course a direct consequence of the positivistic view that scientific descriptions 
ought to be understood in a literal sense. So it is not surprising that within the 
logical positivist tradition, in spite of its orientation towards a logical analysis 
of science, no attempts were made to arrive at a formal analysis and
characterization of analogies. 

The idea that analogies are indispensable in science was strongly defended
by Campbell in his Physics, the Elements [1920]. For him analogies ‘are an 
utterly essential part of theories, without which theories would be completely 
valueless and unworthy of the name' (quoted in Hesse [1966], p. 4). In more 
recent years, Mary Hesse has drawn attention to the use of analogies and 
models in science; she is convinced that there is some element of truth in 
Campbell's position. Hesse's study, Models and analogies in science [1966], is still
one of the best studies in this field, but it does not contain a systematic analysis
of the formal properties of analogies. 

We shall not discuss the fundamental issue whether or not analogies and 
models based upon analogies have a real epistemic function in science. 
Whatever attitude philosophers of science have taken with regard to this 
question, interest in a formal analysis of analogies has on the whole been very 
low. One of the few studies in this direction is Bochenski [1962] which 
analyses analogies from the point of view of mathematical logic. A formal 
classification of the different types of analogies used in the technological 
sciences is presented in Sarlemijn and Kroes [1988] and Sarlemijn [1987]. In
this paper we shall put forward a set-theoretic description of a specific kind of 
analogy, called structural analogies, and shall briefly discuss their methodologi- 
cal role in the physical and technological sciences. In our approach, the 
emphasis is not, as is usual, on structural analogies between the laws 
governing different types of physical systems, but on structural analogies 
between physical systems themselves. The set-theoretic description is the result 
of a straightforward application of the notion of structure as defined by 
Bourbaki [1966]. We shall first elucidate the notion of structural analogy. 


2 STRUCTURAL ANALOGIES BETWEEN PHYSICAL SYSTEMS 


Analogies are based upon (partial) similarities between different kinds of 
physical systems or between the laws governing these systems. These
similarities may be of a substantive (material) or a formal (mathematical) nature
(Nagel [1971], p. 110; Hesse [1966], p. 68). In the first case, a particular kind
of physical system S1 is taken to be a model for the description of another kind
of system S2, whereby it is supposed that certain properties of S2 (or of the
elements of S2) are identical to properties of S1 (or of the elements of S1). This is
for instance the case with the analogy employed in the kinetic theory of gases: 
the molecules of a gas are represented as small billiard balls which have the 
same properties as real billiard balls, such as having a spherical form and 
colliding elastically. In the second case, there exists a similarity in the formal 
(mathematical) relations holding between the elements or properties of system 
S1 on the one hand, and system S2 on the other. So there need not be any
similarity at the level of the elements or properties themselves. A famous 
example of a formal analogy is the one between gravitational processes and 
processes involving heat conduction (see Maxwell [1965], p. 157): 


The laws of the conduction of heat in uniform media appear at first sight among 
the most different in their physical relations from those relating to attractions. 
The quantities which enter into them are temperature, flow of heat, conductivity. 
The word force is foreign to the subject. Yet we find that the mathematical laws of
the uniform motion of heat in homogeneous media are identical in form with
those of attractions varying inversely as the square of the distance. We have only
to substitute source of heat for centre of attraction, flow of heat for accelerating effect of 
attraction at any point, and temperature for potential, and the solution of a problem 
in attractions is transformed into that of a problem in heat. 


From a purely mathematical point of view, therefore, there is no difference 
between one type of phenomena, heat flow, and the other, gravitation. 

Structural analogies are of the formal type. Roughly, two physical systems 
S1 and S2 are called structurally analogous if the mathematical equations 
describing the behaviour of the two systems have the same form. The exact 
meaning of the expression same form will be given below. At first sight, this 
notion of structural analogy seems to be almost identical to Hempel's notion of 
nomic isomorphism which is a ‘syntactic isomorphism between two correspond- 
ing sets of laws’ (Hempel [1965], p. 436). According to Hempel two sets of laws 
are syntactically isomorphic if 


... the empirical terms (i.e., those which are not logical or mathematical)
occurring in the first set of laws can be matched, one by one, with those of the
second set in such a way that if in one of the laws of the first set each term is
replaced by its counterpart, a law of the second set is obtained; and vice versa. 
(ibid.). 


However, in one important respect the notion of structural analogy differs 
from nomic isomorphism. Nomic isomorphism refers to a similarity in the laws 
governing two systems, whereas structural analogy refers to a similarity 
between the mathematical equations describing the evolution of different
physical systems. These mathematical equations need not be identical to the
laws governing these systems. The mathematical equations describing the
evolution of systems are the result of applying the laws of nature to specific
situations, characterized by specific initial and boundary conditions. This
means that the form of the equations of motion of a system may be quite
different from the form of the laws governing the system. In other words, given
two systems S1 and S2, nomic isomorphism between S1 and S2 does not
necessarily imply a structural analogy between S1 and S2, nor conversely.

Let us consider more closely Hempel’s example of nomic isomorphism
between Ohm's law and Poiseuille’s law, describing respectively the flow of
electric current through a wire and the flow of a fluid through a pipe:


$$V = I \cdot C_1$$
$$P = J \cdot C_2$$


with V: the potential difference, I: the electric current, C1: the resistance of the
wire, P: the pressure difference, J: the fluid current, and C2: the resistance of the
pipe. Clearly, these laws have the same form, or are, in Hempel's terminology,
isomorphic. But this nomic isomorphism does not guarantee the existence of a
structural analogy between any two systems, one of which is governed by
Ohm's law, the other by Poiseuille's law. For instance, systems S1 and S2 of
Figure 1 are structurally analogous, but not systems S1 and S3. Although
systems S1 and S3 are governed by isomorphic laws, they are not structurally
isomorphic because the constitution of the two systems is different. This is 
immediately evident from the fact that the number of independent variables in 
the two cases is different. It is therefore impossible to simulate on system S1 the 
behaviour of S3. 
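
Hempel's criterion of term-by-term matching can be illustrated for the two laws quoted above. The following is only a minimal sketch in Python, not Hempel's or the present paper's formalism: the encoding of a law as a tuple of tokens and the names `substitute`, `nomically_isomorphic` and `MATCHING` are assumptions introduced for illustration.

```python
# Laws encoded as token tuples; this encoding is an illustrative assumption.
OHM = ("V", "=", "I", "*", "C1")
POISEUILLE = ("P", "=", "J", "*", "C2")


def substitute(law, matching):
    """Replace every matched empirical term; leave all other tokens unchanged."""
    return tuple(matching.get(token, token) for token in law)


def nomically_isomorphic(law_a, law_b, matching):
    """True if the one-to-one matching turns law_a into law_b, and vice versa."""
    inverse = {image: term for term, image in matching.items()}
    if len(inverse) != len(matching):  # the matching is not one-to-one
        return False
    return (substitute(law_a, matching) == law_b
            and substitute(law_b, inverse) == law_a)


# Hypothetical matching of the empirical terms of the two laws.
MATCHING = {"V": "P", "I": "J", "C1": "C2"}
print(nomically_isomorphic(OHM, POISEUILLE, MATCHING))
```

Such a check concerns only the form of the laws. It says nothing about the constitution of the systems to which the laws are applied, which is why it cannot by itself settle whether two systems are structurally analogous.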


FIGURE 1. Systems S1 (potentials V1 and V2, current I), S2 (pressures P1 and P2, current J) and S3. [Diagram not reproduced.]


Conversely, two systems which are structurally analogous need not be 
governed by isomorphic laws. This is for instance the case with the mechanical 
and electric harmonic oscillators of Figure 2. The mathematical equations 
describing their behaviour have the same form: 


$$m \, \frac{d^2 x}{dt^2} + k x = 0$$
$$L \, \frac{d^2 I}{dt^2} + \frac{1}{C} \, I = 0$$


The laws governing these systems are, however, not isomorphic: the laws of 
classical mechanics do not have the same form as the laws of electricity 
(electromagnetism). 
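
That the two oscillator equations share one form can be made concrete: both are instances of an equation of the type q'' + ω²q = 0, with ω² = k/m in the mechanical case and ω² = 1/(LC) in the electric case. The sketch below is an illustration under assumptions of its own: the function names, the numerical parameter values and the choice of initial conditions (initial value q0, zero initial rate of change) are not taken from the text.

```python
import math


def omega_mechanical(m, k):
    """Angular frequency of m*x'' + k*x = 0."""
    return math.sqrt(k / m)


def omega_electric(L, C):
    """Angular frequency of L*I'' + (1/C)*I = 0."""
    return 1.0 / math.sqrt(L * C)


def oscillate(omega, q0, t):
    """Solution of q'' + omega**2 * q = 0 with q(0) = q0 and q'(0) = 0."""
    return q0 * math.cos(omega * t)


# Hypothetical parameter values chosen so that both systems have omega = 2.
w_mech = omega_mechanical(m=2.0, k=8.0)
w_elec = omega_electric(L=0.5, C=0.5)

for t in (0.0, 0.3, 1.0):
    # The same function describes the displacement x(t) and the current I(t).
    print(t, oscillate(w_mech, 1.0, t), oscillate(w_elec, 1.0, t))
```

With matching frequencies the two systems trace out the same curve, so that one of them can be used to simulate the other, although the sketch makes no use of the laws of mechanics or of electricity as such.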


FIGURE 2. Mechanical and electric harmonic oscillators. [Diagram not reproduced.]


Summarizing, in considering structural analogies between physical systems, 
we not only take account of the (possible similarities between the) laws, but 
also of the constitution of the systems involved. 


3 SET-THEORETIC DESCRIPTION OF STRUCTURAL ANALOGIES 


The general idea behind the following set-theoretic description of structural 
analogies is that physical systems will behave in similar ways if the 
mathematical equations describing their behaviour have the same mathema- 
tical structure. In other words, two physical systems are structurally 
analogous if they are two different physical realizations of the same mathematical 
structure. In that case an isomorphism exists between the mathematical 
structures characterizing the two physical systems. 

Now, in order to formalize these ideas, we shall make use of simplified 
versions of Bourbaki's definitions of structure and isomorphism (Bourbaki, 
[1966]). Roughly a structure is considered to be a packet of relations on a given 
set. 

We shall first introduce some necessary concepts. Let E1, ..., En denote a 
family of sets, called the basis. From this basis we can construct a whole array 
of new sets, called the scala on the basis E1, ..., En, by using the Cartesian
product operator × and the powerset operator P. Let S(E1, ..., En) denote a
particular element from this scala. The relation T(E1, ..., En, s): s ∈ S(E1, ...,
En) is called a typification of s. Given these notions, the definitions of
structuretype T and structure O by Bourbaki run as follows:


— Take a basis E1, ..., En.

— Choose an element from the scala, S(E1, ..., En), and let s be an element
from S(E1, ..., En); in other words, give a typification of s: T(E1, ..., En, s);
T(E1, ..., En, s) is called the typical characterization of the structuretype T.

— Define on S(E1, ..., En) a number of relations R_i (i = 1, ..., m); the
intersection T of these relations is called the structuretype T:

$$T = \bigcap_{i=1}^{m} R_i$$

The R_i are called the axioms of the structuretype T.

— Finally, an element O from T is called a structure of the type T on the basis
E1, ..., En.
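
The four steps above can be followed on small finite sets. The sketch below is a toy illustration and not Bourbaki's own construction: the basis `E1`, `E2`, the two axioms (`total`, `single_valued`) and all names are assumptions chosen so that the resulting structuretype is that of a function from E1 to E2.

```python
from itertools import combinations, product


def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Step 1: a basis (hypothetical toy sets).
E1 = (1, 2)
E2 = ("x", "y")

# Step 2: an element of the scala, here S = P(E1 x E2); each s in S is typified by it.
S = powerset(product(E1, E2))


# Step 3: axioms, each given as the subset of S whose elements satisfy it.
def total(s):
    return all(any(a == e for (a, _) in s) for e in E1)


def single_valued(s):
    return all(b == d for (a, b) in s for (c, d) in s if a == c)


R1 = {s for s in S if total(s)}
R2 = {s for s in S if single_valued(s)}
T = R1 & R2  # the structuretype as the intersection of the axioms

# Step 4: every element O of T is a structure of type T on the basis E1, E2.
print(len(S), len(T))
```

Each element of `T` is the graph of one function from `E1` to `E2`. A different choice of axioms on the same scala element would pick out a different structuretype.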


Thus, the main features of a structure O are determined by (i) the typical 
characterization, and (ii) the axioms of the structuretype. 

Consider, as an example, the structuretype of order. In that case, the basis 
consists of just one set, say A, and the typical characterization is given by 
s ∈ P(A × A). Each s therefore corresponds to a certain binary relation K on the
set (A × A). Now, K is an order relation, i.e. defines an order structure on A, if it
satisfies certain axioms, namely Ax1: K must be transitive, Ax2: K must be
antisymmetric, and Ax3: K must be reflexive (reflexivity does not follow from
Ax1 and Ax2 and has to be required separately). The intersection of
these axioms on P(A × A) defines the structuretype ‘order’. This intersection
contains all s ∈ P(A × A) for which the corresponding binary relations on (A × A)
satisfy all of these axioms. An element from this intersection defines an order
structure on the basis A. 
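
For a small finite set the structuretype ‘order’ can be enumerated by brute force. The following sketch takes A to be a two-element set (an assumption made to keep the enumeration small) and treats each axiom as the subset of P(A × A) whose elements satisfy it. Reflexivity is imposed here as a third, separate condition, in line with the usual definition of a partial order; the names used are illustrative.

```python
from itertools import combinations, product


def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


A = (0, 1)                    # hypothetical two-element basis
M = powerset(product(A, A))   # all s in P(A x A): 16 binary relations


def transitive(s):
    return all((x, w) in s for (x, y) in s for (z, w) in s if y == z)


def antisymmetric(s):
    return all(x == y for (x, y) in s if (y, x) in s)


def reflexive(s):
    return all((x, x) in s for x in A)


ax1 = {s for s in M if transitive(s)}
ax2 = {s for s in M if antisymmetric(s)}
ax3 = {s for s in M if reflexive(s)}

two_axioms = ax1 & ax2        # transitive and antisymmetric relations
orders = ax1 & ax2 & ax3      # order structures on A

print(len(two_axioms), len(orders))  # 12 3
```

The difference between the two counts shows that transitivity and antisymmetry alone admit relations that are not reflexive (the empty relation is one), so the three conditions are independent on this basis.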

Two structures O and O’, defined on different bases, E1, ..., En and
E1’, ..., En’ respectively, are isomorphic if there exists a family of one-to-one mappings
f1, ..., fn from E1, ..., En to E1’, ..., En’ such that under these mappings O is
transformed into O’; F := (f1, ..., fn) is called an isomorphism. Thus, if
s ∈ S(E1, ..., En) satisfies the axioms of structure O, then s’ = F(s) ∈ S(E1’, ...,
En’) satisfies the corresponding axioms of structure O’. In other words, an
isomorphism is a one-to-one mapping which leaves the structural relations 
defined by the axioms invariant. 
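
For structures on a single finite base set, the existence of an isomorphism can be checked by trying every one-to-one mapping between the bases. This is a brute-force sketch for illustration only, restricted to the case n = 1 and to binary relations; the function names and the example relations are assumptions.

```python
from itertools import permutations


def transport(relation, mapping):
    """Image of a binary relation under a mapping of the base set."""
    return frozenset((mapping[x], mapping[y]) for (x, y) in relation)


def find_isomorphism(base_a, rel_a, base_b, rel_b):
    """Return a one-to-one mapping carrying rel_a onto rel_b, or None."""
    base_a, base_b = sorted(base_a), sorted(base_b)
    if len(base_a) != len(base_b):
        return None
    for image in permutations(base_b):
        mapping = dict(zip(base_a, image))
        if transport(rel_a, mapping) == frozenset(rel_b):
            return mapping
    return None


# Two realizations of the same order structure on different bases.
numbers = range(3)
letters = "abc"
chain_numbers = {(x, y) for x in numbers for y in numbers if x <= y}
chain_letters = {(x, y) for x in letters for y in letters if x <= y}
discrete_letters = {(x, x) for x in letters}

print(find_isomorphism(numbers, chain_numbers, letters, chain_letters))
print(find_isomorphism(numbers, chain_numbers, letters, discrete_letters))
```

The first call finds a mapping, the second returns `None`: the bases can be put in one-to-one correspondence in both cases, but only in the first does the correspondence carry one relation onto the other.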

Let us now analyse the rather simple structural analogy between systems S1 
and S2 of Figure 1 in terms of these notions of structure and isomorphism. 
Assume that system S1 obeys Ohm's law and S2 Poiseuille’s law. We shall first
consider the situation in which the properties and dimensions of the wire and 
the pipe (and the properties of the fluid passing through the pipe) are fixed, i.e., 
the situation in which C1 and C2 are constants. Then the configuration space 
X of S1 is a three-dimensional space spanned by the three axes corresponding 
to the variables V1, V2 and I. Of course, these variables are not independent; 
they are related to each other through Ohm's law. This means that the 
collection of states which are ‘accessible’ to system S1, is a subset of the 
configuration space X. In fact, for each value of C1 the subset of accessible 
states is a plane passing through the origin of X. Thus, C1 functions as a 
parameter in configuration space. This subset of accessible states in configuration
space contains all the information about the behaviour of system S1. In
a similar way, all the information about the behaviour of system S2 is
contained in the subset of accessible states in its configuration space Y, which
is also three-dimensional (P1, P2 and J). It is the structure of these subsets that
we are interested in.

TABLE 1

S1               S2
A1: [0, ∞)       B1: [0, ∞)
A2: [0, ∞)       B2: [0, ∞)
A3: (−∞, +∞)     B3: (−∞, +∞)
A4: [0, ∞)       B4: [0, ∞)

Given the above definitions, the structures of these subsets can easily be 
described in the following way. The values of the physical quantities V1, V2, I 
and the parameter C1 (respectively P1, P2, J and C2) are elements of certain 
sets A1, A2, A3 and A4 (B1, B2, B3 and B4). The elements of these sets are real 
numbers multiplied by a physical dimension. Since these physical dimensions 
are not of interest for a mathematical description of the structures of S1 and
S2, we shall ignore them. The sets, all subsets of the reals, are given in Table 1.
Note that the values of the electric potentials V1 and V2 range from zero to plus
infinity and not, as is usually assumed, from minus to plus infinity. Since only
differences in potentials are physically relevant, this choice of A1 and A2 is
allowed.

The sets A1, ..., A4 constitute the basis for the construction of a scala in
case of system S1, and B1, ..., B4 for the scala in case of S2:


S1-basis: A1, A2, A3, A4
S2-basis: B1, B2, B3, B4 


From these scalas we select the sets M = P(A1 × A2 × A3 × A4) and
M’ = P(B1 × B2 × B3 × B4). This choice determines the typical characterization
of the structuretype of the two systems. This choice is rather obvious: each
element from M (M’) corresponds to a relation between the variables and
parameter of system S1 (S2). Now we still have to lay down the axioms
defining the structuretype of the systems; these axioms must be such that they
select those elements from M (M’) that satisfy Ohm's law (Poiseuille’s law).
These axioms are:


System S1 

Ax1: choose all elements s from M such that for each s all (a1, a2, a3, a4) ∈ s
have the same value for a4 (i.e., choose a fixed value for C1).

Ax2: choose all elements s from M such that for all (a1, a2, a3, a4) ∈ s it is true
that (a1 − a2) = a3·a4 (i.e., the variables V1, V2, I and the parameter C1
have to satisfy Ohm's law).


System S2 

Ax1: choose all elements s' from M′ such that for each s' all (b1, b2, b3, b4) ∈ s'
have the same value for b4 (i.e., choose a fixed value for C2).

Ax2: choose all s' from M′ such that for all (b1, b2, b3, b4) ∈ s' it is true that
(b1 − b2) = b3·b4 (i.e., the variables P1, P2, J and the parameter C2 must
satisfy Poiseuille’s law).
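
As an illustration only, the two axioms can be read as predicates on a finite relation s, that is, a finite set of 4-tuples. The sketch below is not part of the original analysis: the function names, the sample values and the numerical tolerance are all assumptions introduced here, and a finite sample can only stand in for the infinite relations that the axioms actually select.

```python
import math


def satisfies_ax1(s):
    """Ax1: all 4-tuples in the relation s share the same fourth component."""
    return len({t[3] for t in s}) <= 1


def satisfies_ax2(s, tol=1e-9):
    """Ax2: every tuple obeys (x1 - x2) = x3 * x4 (Ohm's or Poiseuille's law)."""
    return all(
        math.isclose(x1 - x2, x3 * x4, abs_tol=tol) for x1, x2, x3, x4 in s
    )


def sample_relation(c, flows, second=1.0):
    """Hypothetical helper: build a finite relation for a fixed parameter c.

    The first component is derived from the second one, the flow and c,
    so every tuple satisfies both axioms by construction.
    """
    return {(second + flow * c, second, flow, c) for flow in flows}


# A small relation with fixed parameter 2.0 (values chosen arbitrarily).
s_ok = sample_relation(c=2.0, flows=[0.0, 0.5, 1.5])

# Adding a tuple with a different parameter and a wrong difference
# violates both axioms.
s_bad = s_ok | {(5.0, 1.0, 1.0, 3.0)}
```

The same two predicates serve for both systems, which is the point of the comparison: only the interpretation of the four components differs.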


These axioms determine the structuretype of the two systems. By selecting a 
particular value a for A4 (i.e. for the resistance of the wire) and b for B4 (the
resistance of the pipe) we get the mathematical structures corresponding to
systems S1 and S2. As expected, their structures turn out to be isomorphic.
This follows immediately from the existence of the following one-to-one
mappings:


f1: A1 → B1; a1 = b1
f2: A2 → B2; a2 = b2
f3: A3 → B3; a3 = b3
f4: A4 → B4; f4(a) = b and f4 one-to-one.


These mappings leave the axioms invariant. 

The foregoing analysis immediately shows why systems S1 and S3 of Figure
1 are not structurally analogous. Because system S3 possesses one more
independent variable than S1, the basis and the typification of the structure of
S3 will be different. The structures corresponding with the two systems will
therefore not be isomorphic, notwithstanding that the laws governing these 
systems are isomorphic (in Hempel's sense). 

So far we have assumed that C1 and C2 are constants. This means, among 
other things, that the dimensions of the wire and the pipe are fixed. What 
happens if these dimensions become variable? Do systems S1 and S2 remain 
structurally analogous or not? The answer depends on how C1 and C2 are 
related to the dimensions of the wire and the pipe. Let l be the length of the
wire, r its radius and q the specific resistance of the material of which the wire is
made, and let L be the length of the pipe, R its radius and s the viscosity of the
fluid. Then the following relations are valid:


C1 = (lq)/(πr²)
C2 = (8Ls)/(πR⁴)


Thus, C1 is proportional to the length of the wire and C2 to the length of the 
pipe. From this follows that if the length of the wire in system S1 and the length 
of the pipe in system S2 are treated as variables, then the mathematical
equations describing the behaviour of the two systems still have the same form. 
In other words, both systems remain structurally analogous (see also Hempel
[1965], p. 435). The same conclusion follows if the specific resistance of the
material of the wire and the viscosity of the fluid are treated as independent 
variables. 

The structural analogy, however, no longer holds as soon as the radius of 
the wire and of the pipe become variable. The two systems do not behave in 
similar ways under variations of r and R. In this case the typifications of the 
structures of S1 and S2 still look very much alike, but the final structures are
not isomorphic because one pair of axioms is different (the one expressing the 
relations between the variables involved). 
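
A small numerical sketch may make the contrast concrete. It assumes only the two relations for C1 and C2 given above; the function names and the unit test values are hypothetical and carry no physical significance. Doubling the length scales both resistances by the same factor, whereas doubling the radius scales them by different factors, because C1 depends on the inverse square of the radius and C2 on its inverse fourth power.

```python
import math


def wire_resistance(length, radius, resistivity):
    """C1 = (l q) / (pi r^2), the resistance of the wire."""
    return length * resistivity / (math.pi * radius ** 2)


def pipe_resistance(length, radius, viscosity):
    """C2 = (8 L s) / (pi R^4), the flow resistance of the pipe."""
    return 8 * length * viscosity / (math.pi * radius ** 4)


# Reference configuration with unit values (arbitrary, for comparison only).
c1_ref = wire_resistance(1.0, 1.0, 1.0)
c2_ref = pipe_resistance(1.0, 1.0, 1.0)

# Doubling the length changes both systems by the same factor.
c1_length_factor = wire_resistance(2.0, 1.0, 1.0) / c1_ref
c2_length_factor = pipe_resistance(2.0, 1.0, 1.0) / c2_ref

# Doubling the radius changes the two systems by different factors.
c1_radius_factor = wire_resistance(1.0, 2.0, 1.0) / c1_ref
c2_radius_factor = pipe_resistance(1.0, 2.0, 1.0) / c2_ref
```

Under these assumptions the length factors coincide while the radius factors do not, which mirrors the claim that the analogy survives variation of length but not of radius.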


4 DISCUSSION 


In science, especially in the technological sciences, structural analogies 
between systems are often more important than structural analogies between 
laws. Within the technological sciences there is a whole body of knowledge, 
known as ‘theory of similitude’ or ‘Aehnlichkeitstheorie’ (see, e.g., Murphy
[1950], Szücs [1980], Pawlowski [1971], Langhaar [1965]), which is
focussed on the analysis of structural analogies between technologically
relevant systems. Engineers skilfully exploit structural analogies for the
construction of physical models of systems which are not available for direct
experimental investigation (such as models of aeroplanes or of rivers). Murphy
([1950], p. 57) describes a physical model as a ‘device which is so related to a
physical system that observations on the model may be used to predict
accurately the performance of the physical system in the desired respect’. This
description captures the main methodological role of structural analogies
between systems in the technological and physical sciences, namely the 
simulation of systems and processes. 

From a methodological point of view the primary function of structural 
analogies is that they enable the scientist or the engineer to construct 
substitutes for systems which, for whatever reasons, are not available for direct 
investigation and experimentation. These substitutes may be of the same 
physical nature as the system for which they stand model, as with scale 
models, or they may be of a completely different physical constitution, as, e.g., 
in the case of the electrical simulation of an acoustical system. In the first case, 
the two systems obey, of course, the same physical laws; in the second, these 
laws do not necessarily have to be isomorphic. A very well-known example of 
the second kind of substitute is the analogue computer; the operation of the 
analogue computer is based on structural analogies: an electrical network is 
built whose structure is identical to the structure of the system to be 
investigated. 

Closely related to the use of structural analogies is the use of dimensional 
analysis in the physical and technological sciences. In general, dimensional 
analysis is applied in situations where the relevant physical variables are 




known, but where the laws and the (differential) equations describing a system 
are not known. In those cases, experimental results obtained on physical scale 
models can nevertheless often be evaluated with regard to the original, full- 
scale system, by use of relationships between dimensionless constants (Szücs
[1980], p. 101). Here the emphasis is put not on an isomorphism between the 
laws governing the two systems involved (these laws are, although unknown, 
the same), but on structural similarities between systems which are related to 
each other by scale transformations with regard to the relevant variables. 

In practice, finally, the use of structural analogies for simulating the 
behaviour of systems may encounter serious problems. Idealizations, for 
instance, may harm the validity of analogical reasoning, particularly in those
cases where different idealizations are employed in the description of 
analogical systems. From a methodological point of view, however, little is 
known about the different kinds of idealizations that are involved in structural 
analogies, and about how they affect the validity of structural analogies. This 
is still a topic of further research. 


Technical University Eindhoven/ University of Nijmegen 
The Netherlands 
REFERENCES 


BOCHENSKI, I. M. [1962]: ‘On Analogy’ in A. Menne (ed.), Logico-Philosophical Studies,
Reidel, Dordrecht, pp. 97-117.

BOURBAKI, N. [1966]: Eléments de mathématiques; Théorie des ensembles, Fascicule XXII,
Ch. 4, Hermann, Paris.

HEMPEL, C. G. [1965]: Aspects of scientific explanation, The Free Press, New York.

HESSE, M. B. [1966]: Models and analogies in science, University of Notre Dame Press,
Notre Dame, Indiana.

LANGHAAR, H. L. [1965]: Dimensional analysis and theory of models, John Wiley & Sons,
New York.

MAXWELL, J. C. [1965]: The scientific papers of James Clerk Maxwell, ed. W. D. Scriven,
Dover Publ., New York.

MURPHY, G. [1950]: Similitude in Engineering, The Ronald Press Co., New York.

NAGEL, E. [1971]: The structure of science, Routledge & Kegan Paul, London.

PAWLOWSKI, J. [1971]: Die Aehnlichkeitstheorie in der physikalisch-technischen Forschung,
Grundlagen und Anwendung, Springer-Verlag, Berlin.

ROTHBART, D. [1984]: ‘The semantics of metaphor and the structure of science’,
Philosophy of Science 51, pp. 595-615.

SARLEMIJN, A. and KROES, P. [1988]: ‘Technological analogies and their logical nature’,
in P. T. Durbin (ed.), Philosophy & Technology 4: Technology & Contemporary Life,
Reidel, Dordrecht, pp. 237-55.

SARLEMIJN, A. [1987]: ‘Analogy analysis and transistor research’, Methodology and
Science, Vol. 20, no. 3.

SHIVE, J. N. and WEBER, R. L. [1982]: Similarities in physics, Adam Hilger Ltd., Bristol.

SZÜCS, E. [1980]: Similitude and Modelling, Fundamental Studies in Engineering 2,
Elsevier Scientific Publ. Co., Amsterdam.


Brit. J. Phil. Sci. 40 (1989), 155-166 Printed in Great Britain


Frege, Informative Identities, 
and Logicism 


PETER MILNE 


Frege’s Grundgesetze der Arithmetik are inconsistent, as Russell first caused to be 
observed. My purpose here is to demonstrate that what Frege has to say about 
them is also inconsistent. Specifically, there is no Fregean way to reconcile the
claims that arithmetical identity statements can be informative because they 
contain singular terms possessing co-referential but distinct senses, and that 
arithmetical truth is logical truth. Gregory Currie (Currie [1982]) has noticed 
this incompatibility and presumably regards it as a fairly obvious one, for he 
offers little in the way of detailed argument. Christian Thiel (Thiel [1968]) also 
realized that there is a problem here. Unfortunately his main supporting 
evidence has been dismissed by Michael Dummett. Since the incompatibility is 
not widely acknowledged and is therefore not so obvious to others it seems 
worthwhile filling in the details. 

The laws of logic are for Frege analytic by the definition of that term. More 
importantly for present concerns, they are always and everywhere true. Yet a 
simple arithmetical truth such as ‘2 + 2 = 4’ is conceivably false: the senses of
the terms on either side of the equality-sign differ and hence the thought
expressed by the whole equation is not true in virtue of the identity of the
senses of its parts.¹ Frege would have us believe that all arithmetical truths are
derivable by truth-preserving rules of inference from his basic laws of logic.
Evidently matters are less than transparently clear and a likely source of
disquiet is the claim that the senses of ‘2 + 2’ and of ‘4’ are different: we shall
see that it proves incompatible with the thesis that arithmetic is derivable from 
logic. 

For Frege logic has truth for its subject matter: 


What are often called laws of thought, namely laws in accordance with which 
judging, at least in normal cases, takes place, can be nothing but laws for holding 
something to be true, not laws of truth. If a man holds something to be true—and 
the psychological logicians will surely hold that their own statements at least are 
true—he thereby acknowledges that there is such a thing as something’s being 
true. But in that case it is surely probable that there will be laws of truth as well, 
and if there are, these must provide the norm for holding something to be true. 
And these will be the laws of logic proper.²


1 Frege [1893/1903], p. 35.   2 Frege [1897], p. 146.


I understand by ‘laws of logic’ not psychological laws of takings-to-be-true, but 
laws of truth.³


Logic is concerned with the laws of truth, not with the laws of holding something 
to be true, not with the question of how men think, but with the question of how 
they must think if they are not to miss the truth.⁴


[The extensionalist logicians] are right when they show by their preference for 
the extension, as against the intension, of a concept that they regard the 
reference and not the sense of words as the essential thing for logic. [. . .] [The 
intensionalist logicians] forget that logic is not concerned with how thoughts, 
regardless of truth-value, follow from thoughts, that the step from thought to 
truth-value—more generally, the step from sense to reference—has to be taken. 
They forget that the laws of the logic are first and foremost laws in the realm of 
references and only relate indirectly to sense.⁵


Frege places logic squarely within the realm of reference but sense 
determines reference; an expression only refers via its sense, whether that 
expression be proper-name, predicate, or sentence (itself a species of proper- 
name). The thought expressed by a sentence yields truth conditions for that 
sentence, for sense determines reference, which in this case is truth-value. As
Frege says:


[N]ot only a denotation, but also a sense, appertains to all names correctly 
formed from our signs. Every such name of a truth-value expresses a sense, a 
thought. Namely, by our stipulations [concerning the eight primitive signs of 
Grundgesetze] it is determined under what conditions the name denotes the True.
The sense of this name—the thought—is the thought that these conditions are
fulfilled.⁶


As a perusal of Grundgesetze confirms, Frege introduces his rules of inference 
and shows them to be truth preserving. Similarly with the six basic laws that 
he introduces: he scrupulously argues that the falsity of any is impossible. In 
the case of the ill-fated Axiom V he introduces identity of course-of-values as 
being equivalent, by stipulation, to identity of values for all arguments, in full 
knowledge of the ontological commitments to courses-of-values as objects 
thereby embraced. Extensions of concepts and, more generally, courses-of- 
values are par excellence what Frege called logical objects. The crucial fact is 
that, rightly or wrongly, he thought that all his basic laws are logically true, 
true in all possible circumstances, or perhaps most accurately, true under all 
assignments of truth-values, consonant with his definitions, to the subsentential
parts. (Frege’s account of quantification is stated substitutionally, hence
the propriety of assigning truth-values to subsentential expressions.⁷)

3 Frege [1893/1903], p. 13.   4 Frege [1897], p. 149.
5 Frege [1892-5], p. 122, ‘reference’ substituted for ‘meaning’. For remarks similar to the
foregoing see Frege [1893/1903] pp. 12-13, 15. Frege held to this view of logic; see Frege
[1918-19], pp. 351-2.
6 Frege [1893/1903], pp. 89-90.   7 Cf. Dummett [1981], pp. 285-6.


We see this if we examine the basic laws. Axiom I is a simple truth of 
propositional logic. In modern notation it may be rendered (a → (b → a)). From
§18, where he introduces the axiom, Frege refers us back to the definition of his
condition-stroke (our ‘→’) in §12 and asserts that in virtue of that definition
the axiom could only be false if both a and b were the True while a was not the
True, which is, as he says, impossible. Axioms II and III are recognizably
logical to the latter-day reader and are similarly justified in terms of truth-value
assignments. Axiom IV is less familiar and relies for its justification on
Frege's identification of truth-values as the references of sentences. Nevertheless
it contains nothing that need detain us here. Axiom VI, which states that x
is identical with the object that is identical to x, or in current notation
x = ιy(y = x), is immediately forthcoming from Frege's stipulation regarding
the interpretation of his substitute for the definite article. The most interesting—and
puzzling—case is Axiom V. In a notation that is only partly Frege's
the axiom can be stated thus


(ἐf(ε) = ἐg(ε)) = (x)(f(x) = g(x))


where ἐf(ε) is the course-of-values of the function f(ξ). It is introduced as a law
of Begriffsschrift in §20. For the justification we are referred back to §§3 and 9,
the latter informing us that ‘an identity of course-of-values may always be
transformed into the generality of an identity, and conversely’. §9 takes us
back to §3 where Frege states that he uses ‘the words “the function Φ(ξ) has
the same course-of-values as the function Ψ(ξ)” generally to denote the same as
the words “the functions Φ(ξ) and Ψ(ξ) have always the same value for the
same argument” ’. Given that stipulation Axiom V is indeed true, but that it is a
logical truth is perhaps less obvious. It must be remembered that for Frege 
courses-of-values are logical objects. A sentence that might otherwise appear 
to make a substantive claim or to be true in virtue of conventionally assigned 
meanings achieves the status of a law of logic because it expresses what about 
the logical objects in question is important for logic. Axiom V expresses the
identity condition for courses-of-values.⁸ (Ex nihilo nihil fit—Axiom I is only a
logical truth in virtue of the conventions governing the condition-stroke.) The
justification of Axiom V does differ from the justifications of the other axioms in
that it does not explicitly mention truth or the True, but this is only a
superficial difference. In the quotation above from §3 the two expressions are
said to denote the same, which means that they take the same truth-value. 
Thus, under any assignment of truth-values the left- and right-hand sides of 
Axiom V possess the same truth-value. It may appear at first blush an 
illegitimate jump from identity of truth-value to identity under all truth- 
valuations but as Frege does not expand upon the above-cited explanation the 
stronger reading is the more likely—mere truth scarce suffices for a basic law of 
logic. (The reader must bear in mind that, as Frege emphasizes in §10, Axiom
8 Cf. Resnik [1980], p. 207. 




V determines neither the senses nor the references of the expressions ‘ἐf(ε)’ and
‘ἐg(ε)’.)
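
The justification reported for Axiom I amounts to a truth-table argument, and it can be checked mechanically. The sketch below is an illustration only: it treats the condition-stroke as the modern material conditional, which is an assumption of this example, and the function names are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def axiom_i(a, b):
    """Axiom I in modern notation: a -> (b -> a)."""
    return implies(a, implies(b, a))


# Evaluate the axiom under every assignment of truth-values to a and b.
axiom_i_is_tautology = all(
    axiom_i(a, b) for a, b in product([True, False], repeat=2)
)
```

No comparable finite check is available for Axiom V, which is one way of seeing why its justification has to proceed by stipulation rather than by a survey of truth-value assignments.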

Axiom V, we have seen, is a truth of logic. Is it true in virtue of the senses of 
its sentential components, the left- and right-hand clauses? Michael Dummett
concludes, after a careful examination of some grey areas in a Fregean analysis
of synonymy, that it is not.⁹ Whilst we may acknowledge the import of the
evidence that he adduces in support of this contention, the preponderant
evidence is against it. Three considerations can be brought to bear.

First, the textual evidence, in particular the passage from 'Function and 
concept’, cited by Dummett, that asserts sameness of sense of the two clauses.¹⁰
(This passage and the Grundgesetze attribution of distinct senses to the terms in
true arithmetic identities suffice for Currie.) Dummett claims that the use of the
term ‘transformation’ in Grundgesetze for the statement of identity between the
two clauses indicates a revision. But it might just refer to the transition from
one clause to the other in a chain of inferences. Frege takes pains to point out
that the axiom is employed by logicians, often tacitly.¹¹ It licenses the
transition between the clauses. The other piece of overt textual evidence, cited
by Thiel, comes in a letter from Frege to Husserl, dated December 9, 1906, in
which logical equivalence is proposed as a criterion of sentential synonymy.¹²
Dummett, perhaps charitably, certainly weakly, suggests that we do not take a
scholar's private correspondence as an authoritative source.¹³

Second, the passage in §10 in which Frege addresses the fact that Axiom V
fixes neither the references nor the senses of courses-of-values. That being so,
an arbitrary one-one function of courses-of-values, say X, could be introduced
and an identity between its values, i.e. X(ἐΦ(ε)) = X(ἐΨ(ε)), rather than simply
between the courses-of-values themselves, would serve equally well in
satisfying Axiom V. Frege adds in a footnote, despite this new identity having
the same truth-value as the generalization ‘(x)(Φ(x) = Ψ(x))’, that this is not to
say that their senses are the same. It is tempting to read into this the
qualification ‘in this case’. Why else should he have added the footnote in §10
and not in §9, or even §3? (It does not count against this point that if by logic
alone it can be shown that the function X is one-one on courses-of-values then
the new identity must, under the construal of the basic laws of logic urged
here, have the same sense as the old identity and as the generalization—
nothing Frege says contradicts that fact.) Frege's first sentence in §10 says that
he has laid it down that the two clauses of Axiom V denote the same. This shows 


9 Dummett, op. cit., pp. 323-43; see also pp. 456, 530-2.

10 Frege [1891], pp. 142-3. Like Currie, Hans Sluga considers this statement sufficient to show
that Axiom V is true in virtue of the identity in sense of its two main clauses; see Sluga [1980],
p. 156.

11 Frege [1893/1903], pp. 3, 44.

12 Frege [1980], p. 70. Cited Thiel [1968], p. 131. As the letter post-dates Russell's
communication revealing the deficiency of Axiom V it is far from clear how much weight
should be given to it in the present context.

13 Dummett, op. cit., pp. 324-5.


that it is a matter of stipulation that the two clauses have the same sense; 
courses-of-values, whatever they may subsequently be defined to be, are to 
comply with this stipulation. This reading is confirmed by §9, which refers
explicitly to the stipulation in §3 concerning the generality of an identity and
the identity of courses-of-values. Frege goes on in §10 to say, rightly, that this
stipulation does not fix the references of the terms denoting courses-of-values,
i.e. does not determine which objects are courses-of-values. In claiming that
Axiom V is not true in virtue of an identity of sense between its sentential 
components Dummett has to assume that the sentence expressing the identity 
of courses-of-values has a sense independently of its occurrence in Axiom V, 
and thus a sense prior to the Grundgesetze inquiry. This makes nonsense of 
Frege’s calling the axiom a stipulation, and of the declared aim of §10, which is 
the specification of courses-of-values. From one so scrupulous on questions of 
definition as Frege, such errors are highly unlikely. 

Third, and most important in the light of Frege’s intent, the consequences of 
the supposition that in Axiom V the senses of the two clauses are distinct. Were 
that so Axiom V would, contrary to our reading of §3, be conceivably false, a
contention with intolerable consequences. In an article published not long 
after the appearance of Grundlagen Frege compares geometry to arithmetic: 


[A]ll arithmetical propositions can be derived from definitions alone using purely 
logical means, and consequently [...] they also must be derived in this way. 
Herewith arithmetic is placed in direct contrast with geometry, which, as surely 
no mathematician will doubt, requires certain axioms peculiar to it where the 
contrary to these axioms—considered from a purely logical point of view—is just 
as possible, i.e. is without contradiction. Of all the reasons that speak in favour of 
this view, I here want to adduce only one based on the extensive applicability of 
mathematical doctrines. As a matter of fact, we can count just about everything 
that can be an object of thought: the ideal as well as the real, concepts as well as 
objects, temporal as well as spatial entities, events as well as bodies, methods as 
well as theorems; even numbers can in their turn be counted. What is required is 
really no more than a certain sharpness of delimitation, a certain logical
completeness. From this we may undoubtedly gather at least this much, that the 
basic propositions on which arithmetic is based cannot apply merely to a limited 
area whose peculiarities they express in the way in which the axioms of geometry 
express the peculiarities of what is spatial; rather, these basic propositions must 
extend to everything that can be thought. And surely we are justified in ascribing
such extremely general propositions to logic.¹⁴


The later Frege would not have put the matter quite this way but is there 
reason to suppose that he had given up this view by 1893? I think not. The 
implication is that the truths of arithmetic cannot conceivably be false. Now it 
is possible to maintain that while the truths of arithmetic are not conceivably 
false, the basic laws of logic, from which they are derived, are. But this robs the 


14 Frege [1885], p. 112. See also Frege [1884], p. 21. 




derivation of its power to explain the necessary truth of arithmetic and is 
certainly not what Frege had in mind. Moreover, if conceivably false, what 
grounds the lawlikeness of Axiom V? The laws of logic are, as we have seen, 
laws of truth—what, if not considerations of sense, grounds their necessity? 
And if it is ungrounded there is a genuine epistemological difficulty for the laws 
of logic are nevertheless taken to be correct: 


[T]here is no such thing as a peculiarly arithmetical mode of inference that 
cannot be reduced to the general inference-modes of logic. If such a reduction 
were not possible for a given mode of inference, the question would immediately
arise, what conceptual basis we have for taking it to be correct.¹⁵


This passage comes from an article published in 1885. Later, in the opening 
pages of his contribution to the controversy with Hilbert over geometry, Frege 
makes a similar assertion: 


Traditionally, what is called an axiom is a thought whose truth is certain 
without, however, being provable by a chain of logical inferences. The laws of 
logic, too, are of this nature.¹⁶


How do we achieve this certainty? How could we attain it if one can grasp 
the senses of the constituent clauses of Axiom V and yet fail to appreciate 
thereby that the axiom itself is true? Surely its source is none other than the 
proper understanding of the laws themselves, that is, of the senses of the 
subsentential expressions occurring in the laws. 

These three considerations, I believe, sway the balance of evidence in favour 
of the assertion that Axiom V does not differ in status from the other axioms. It 
is necessarily true, that is to say, true in virtue of the identity of the senses of its 
constituent clauses. (In one particular Axiom V does differ from the others—its 
lack of self-evidence. In the 1903 post-script to the second volume of
Grundgesetze, written in the light of the derivation of Russell's paradox from the 
basic laws, Frege sees this lack as severe and reprehensible. Self-evidence—and 
truth—of axioms was to play an important role in Frege's controversy with 
Hilbert over the nature of axiomatic systems. But self-evidence is not relevant 
to the concerns of the present discussion since it has no bearing on the 
necessary truth or otherwise of the basic laws of the system.) 

And now we are faced squarely with our problem. Rules of inference 
preserve truth—free variables are taken as indicators of generality—and the 
basic laws are all necessarily true. Yet an arithmetical equation such as 
‘2 + 2 = 4’ is not true in virtue of an identity of the senses of its component
terms and therefore is conceivably false in just the same way as the basic laws
are not conceivably false. As sense determines reference a necessarily true
sentence is so in virtue of the thought it expresses. From the foregoing we see 


15 Frege [1885], p. 113. 
16 Frege [1903], p. 273. Published in 1903 this was presumably written prior to receipt of 
Russell's thunderbolt. 




that the only possible conclusion is that arithmetical identities derived from the 
basic laws are necessarily true, i.e. true in virtue of the identity of the senses of 
the terms involved. Frege’s logicism is incompatible with the thesis that true 
arithmetical identities may be informative. 

One way to rescue Frege would be to construe thoughts and senses in 
general as hyperintensional. That is, to construe identity of thought as 
sufficient but not necessary for identity of truth-value under all possible truth- 
valuations. Correspondingly, for senses this construal leads to identity of the 
senses of the terms involved being sufficient but not necessary for the truth of 
an identity assertion under all truth-valuations. The sense of a subsentential 
expression could no longer be taken as the contribution that that expression 
makes to the truth conditions of the sentences in which it may occur, and for 
sentences difference in thought expressed would not necessarily correspond to 
difference in truth-conditions. 

Dummett has claimed it a virtue of Frege’s notion of sense that it permits this 
gap between Fregean thought and truth-conditions. More exactly, by 
emphasizing the role of sense as a cognitive notion he maintains that Frege, 
unlike later philosophers such as the Tractarian Wittgenstein and the logical 
positivists, could ascribe a cognitive value to analytic statements.¹⁷ To treat
sense as a cognitive notion is to take the primary understanding of sense to be 
what Frege called mode of presentation. Gareth Evans has explained the mode 
of presentation of a Fregean proper name as a particular way of thinking of the 
object named.¹⁸ On this account ‘2 + 2 = 4’ is informative because ‘2 + 2’ is
associated with a way of thinking of the number four different from that
associated with ‘4’. Frege seems to have thought that the mode-of-presentation
and contribution-to-truth-conditions accounts of sense run hand in
hand. Dummett has argued that they come apart in the case of indexicals.¹⁹ Do
arithmetic terms provide another example? I am inclined to think not. 

To further perspicuity in explaining how I come to that conclusion I shall 
use the following formulations: where A is (an occurrence of) a singular term 
the sense of A is the mode of presentation for A; this is a mode of presentation of 
the object presented. The point of this is that where A and B are co-referring 
terms A⌢‘=’⌢B is informative when the mode of presentation for A differs from
that for B, although both are modes of presentation of the single object named
by A and B. ‘2 + 2 = 4’ is informative because the modes of presentation for
‘2 + 2’ and for ‘4’ differ, or at least that is the claim. If A⌢‘=’⌢B is logically
contingent there are possible circumstances in which A and B do not co-refer.
Hence it is possible that the mode of presentation for A should present a
different object from the mode for B, even if, in fact, they do not. However, at
least according to Frege, ‘2 + 2 = 4’ is logically necessary. As we have seen,
that claim implies that there are no (logically) possible circumstances in which 


17 Dummett [1978], pp. 420-1.
18 Evans [1982], pp. 14-17.
19 Dummett [1981], pp. 83-147.


different objects are presented by the modes of presentation for ‘2 + 2’ and for
‘4’. No predicate can be true of 2 + 2 and false of 4. There is therefore no
possible difference to 'pick up on'—in any possible circumstances a mode of 
presentation of 2 + 2 is also a mode of presentation of 4. How is one to divine a 
difference between modes of presentation when in no possible circumstances is 
there a difference in the objects presented? Put another way, there is no 
possible way of thinking of 2 + 2 that would not also be a way of thinking of 4, 
so how can one way of thinking of four be a mode of presentation for 4 and not 
simultaneously a mode of presentation for 2 + 2?

The crucial question is this: how are modes of presentation individuated?
Certainly not by the objects that they present for no true identity would be
informative if that were so. Equally, not by the expressions for which they are 
modes of presentation for then no identity would be uninformative save 
substitution instances of ‘x= x’, and translation would be impossible. Modes of 
presentation have to be sufficiently differentiated for it to be the case that the 
modes for ‘2 + 2' and for ‘4’ differ, while those for ‘deux’ and for ‘zwei’ do not.?° 
One way to construe a mode of presentation is in analogy with seeing an object 
under an aspect. Thus under different aspects different characteristics are 
associated with the object, and where possession of one set of characteristics 
does not entail possession of another it is conceivable that the objects differ. 
The analogue of that route is not open to Frege unless the associated 
characteristics fail to uniquely distinguish the object in question, a suggestion 
that runs contrary to the idea of sense as the determinant of reference. The 
possibility that the sets of characteristics considered can apply to qualitatively 
indistinguishable but numerically distinct objects cannot be countenanced on 
the Fregean view (unless there are conceivable circumstances that are not 
logically possible). 

According to Dummett and Evans certain sentences stating the reference 
(semantic value) of a singular term serve to show its sense.²¹ So where ‘a = b’ is 
a contingently true identity 

(1) The reference of ‘a’ = a 
serves to show the sense of ‘a’ whereas 

(2) The reference of ‘a’ = b, 
although true, does not. Now in the case when ‘a = b’ is a logically necessary 
identity (2) is immediately deducible from (1) without appeal to further 
empirical facts belonging to the theory of reference for the language in 
question. Logical truths are true independently of empirical facts about 
reference—that is why not all logically possible worlds are metaphysically 


20 Or are modes of presentation so parochial as not to survive translation? It could be said that all 
that matters for translation is preservation of truth-value in all possible circumstances, not 
preservation of sense. However, the same considerations cannot be allowed to apply to one's 
own idiolect on different occasions of utterance or the hyper-intensional approach is sunk. 

21 See Evans, op. cit., p. 26. 




possible—so nothing extra is being smuggled in. I can see no reason why, in 
that case, (2) does not also show the sense of ‘a’. Presumably Dummett can. He 
produces an impressively off-hand dismissal of logic when he says: 


It is analytic that two predicates have the same extension if we are able to prove 
this fact by certain restricted means; but they may, nevertheless, have different 
senses, because our primary means of recognizing that they apply to any given 
object may differ.²² 


The ‘certain restricted means’ are the means of Fregean logic. If logical truths 
have cognitive value, i.e. are informative, why should we single them out in 
particular rather than, say, the Peano—Dedekind axioms for arithmetic? If it is 
because we know them to be true in all conceivable circumstances how do we 
know this if not by means of a proper grasp of the thoughts they express? How 
could Frege confirm, as he says is his intention, that arithmetic is a branch of 
logic and need not borrow any ground of proof whatever from either 
experience or intuition?²³ But these questions are variations on ones already 
asked above. 

As a matter of psychological fact logical truths can surprise us. A hyper- 
intensional theory of sense can explain this phenomenon and that is in its 
favour. On the other hand, and in contradistinction to the sense-as- 
contribution-to-truth-conditions account, such a theory cannot invoke 
difference of sense alone in order to explain difference in truth-value between 
sentences containing intersubstituted co-referential subsentential expressions, 
for it permits differences of sense that lead to no difference in truth-value under 
any circumstances. That counts against it. The truth-conditional reading can 
account for the psychological fact in the following way. If there is anything in 
Frege's metaphorical depiction of senses as things to be grasped then grasping 
is an action that can be mismanaged—it can be done well, done badly, or not 
done at all. ‘2 + 2 = 4’ is informative only to the extent that one has not fully 
grasped the one sense that both the singular terms have. 

One last point on hyper-intensional sense. Even if what Frege says about the 
status of arithmetical truths can be rendered consistent on the modes-of- 
presentation view of sense, it is nevertheless worthwhile pointing out that the 
alternative contribution-to-truth-conditions account leads to inconsistency in 
the context of arithmetic, that is, in the context of a precise formal language, not 
only in natural language. 

The Grundlagen notion of logical truth is perhaps less clearly stated. There 
logical laws are basic but the connection with truth is not spelt out in any 
detail. Nevertheless the question as to how arithmetical truths can be 
informative does arise. Frege emphatically declares that analytic judgments 
and chains of inferences can be informative without clearly distinguishing the 


22 Dummett [1978], p. 421. 
23 Frege [1893/1903], p. 24. 




psychological reading of this from the logical: consequences are contained in 
premises and we can discover them by means of logical inferences. That the 
problem takes an especially acute form in the earlier work is due to the fact that 
in 1884 Frege still maintained some aspects of the Begriffsschrift theory of 
identity statements. According to the later ‘On the Concept of Number’, 
published posthumously, if ever it appears that something is asserted of 4 that 
cannot also be asserted of 2+2 then it is the signs themselves that are the 
subject of discussion—they are in that context autonymous—and not the 
single number that both expressions denote.²⁴ Now if the identity ‘2 + 2 = 4’ is 
informative then it does appear that something is being asserted of 2 + 2 that is 
not asserted of 4, for the identity ‘4=4’ is anything but informative. Likewise, 
something is simultaneously asserted of 4 and not of 2 + 2. That being the case 
the equation is to be understood with the signs denoting themselves. But then 
one wonders how Frege could ever have believed any informative identity 
statement to have been a logical truth. That some pair of distinct signs in some 
language refer to the same object is exactly the sort of information that can 
only be given in an analytic judgment if it can be deduced from basic logical 
laws and strictly uninformative definitions. Thus, an arithmetic identity can 
only be informative in the psychological sense, and only to one who has failed 
to realize that it is so deducible. 

Returning to Grundgesetze, we may ask how Frege failed to see the problem 
that he had created. The theory of sense and reference first appears in Frege's 
writings in 1891. He had been engaged on the project that culminated in the 
publication of Grundgesetze, in 1893, for several years. The begriffsschriftlich 
codification of arithmetic had occupied him since at least 1879, as he tells us in 
the Introduction of the 1893 work. Grundlagen had heralded its completion. 
Yet changes in his logical system 'forced [him] to discard an almost completed 
manuscript'. One of these changes is due to the new theory of identity 
statements that is ushered in with sense and reference. This account of identity 
was better in accord with mathematical practice, hence its incorporation into 
a revised system of logic could have seemed only to the good. Moreover, the 
emphasis on logic as providing laws of truth, i.e. reference, probably obscured 
the role sense had to play in logical laws. These facts conspired to hide the 
incompatibility between informative arithmetic identities and logicism, an 
incompatibility that arises in a failure to mesh of separate strands in Frege's 
thought, namely logicism, which dates back at least to 1879, and the theory of 
meaning developed in the early 1890s. 

Currie's explanation is quite different. He sees the incompatibility as 
resulting from a confusion on Frege's part, a confusion of two senses of sense. 
On the one hand there is the weak notion that allows for the difference in sense 
between ‘2 + 2’ and ‘4’; on the other there is the strong notion that yields the 


24 Frege [1891/2], p. 85. The dating of this piece is suspect. Dummett [1981], p. 606, plausibly 
places it as 1888-90. 




identity of sense of the left- and right-hand clauses of Axiom V. Distinguishing 
between these, as Frege failed to do, resolves the impending paradox. It may be 
tempting to equate Currie's weak sense with mode of presentation, his strong 
with contribution to truth conditions. He himself does not suggest this 
equation. I should urge that the temptation be resisted. 

In our present-day sophistication we can see that Frege’s Axiom V ought 
never to have been accorded the status of a necessary truth. This is not because 
it leads to contradiction; rather it is because the ontological commitments of its 
left- and right-hand sides are so very different. This difference is manifest in the 
difference in logical form. Presumably Frege thought that both clauses 
represent a single thought, indeed that has to be the case if the present account 
of the basic laws of logic is correct. This suggests that Frege's conceptual 
notation is a somewhat imperfect instrument for the representation of Fregean 
thoughts.²⁵ For example, there are an infinity of tautologies expressible in 
Frege's system that represent the same truth-function: (a → a), a → (a → a), 
a → (a → (a → a)), etc. (Such redundancy is vital to a logical calculus but hardly 
what one expects of the vehicle designed to capture pure thought.) Axiom V at 
once makes a substantive ontological assertion and appears irresistible when it 
is granted that functions have courses-of-values. Perhaps then the fault lies 
not with the axiom itself but with the assumption that the courses-of-values 
'generated' by the axiom lie within the range of the first-order variables. For 
Frege that assumption is irrecusable: there is only one universe (of discourse)²⁶ 
and the only ontological stratification is between concepts and objects. The 
basic laws of logic are best seen not so much as laws of truth but rather as 
foundational equations for a calculus of senses, i.e. they tell us not about truth 
itself but how thoughts, our tool for grasping truths, are related.²⁷ 


University of Liverpool 


25 Cf. Dummett, op. cit., pp. 331-2. 
26 On the one universe of discourse see Jean van Heijenoort [1967]. 
27 My thanks to an anonymous referee for helpful comments on an earlier version of this article. 


REFERENCES 


CURRIE, G. [1982]: ‘Frege, Sense and Mathematical Knowledge’, Australasian Journal of 
Philosophy, 60, pp. 5-19. 

DUMMETT, M. [1978]: ‘The Social Character of Meaning’, in Truth and Other Enigmas, 
London, Duckworth. 

DUMMETT, M. [1981]: The Interpretation of Frege's Philosophy, London, Duckworth. 

EVANS, G. [1982]: The Varieties of Reference, Oxford, Oxford University Press. 

FREGE, G. [1884]: Die Grundlagen der Arithmetik, trans. J. L. Austin as The Foundations of 
Arithmetic (second edition), Oxford, Basil Blackwell, 1953. 

FREGE, G. [1885]: ‘On Formal Theories of Arithmetic’, Frege [1984], pp. 112-21. 

FREGE, G. [1891/2]: ‘On the Concept of Number’, Frege [1979], pp. 72-86. 

FREGE, G. [1892-5]: ‘Comments on Sense and Meaning’, Frege [1979], pp. 118-25. 


Brit. J. Phil. Sci. 39 (1988), 167-181 Printed in Great Britain 


The Autonomy of Probability Theory 
(Notes on Kolmogorov, Rényi, and Popper) 


HUGUES LEBLANC 


ABSTRACT 

Kolmogorov's account in his [1933] of an absolute probability space presupposes 
given a Boolean algebra, and so does Rényi's account in his [1955] and [1964] of a 
relative probability space. Anxious to prove probability theory ‘autonomous’, Popper 
supplied in his [1955] and [1957] accounts of probability spaces of which Boolean 
algebras are not prerequisites but byproducts instead.¹ I review the accounts in 
question, showing how Popper's issue from and how they differ from Kolmogorov's 
and Rényi's, and I examine in closing Popper's notion of ‘autonomous 
independence’. So as not to 
interrupt the exposition, I allow myself in the main text but a few proofs, relegating 
others to the Appendix and indicating as I go along where in the literature the rest 
can be found. 


1. By Boolean algebra understand a triple $\langle S, \bar{\ }, \cap \rangle$, where S is a non-empty 
set, $\bar{\ }$ a function from S into S, and $\cap$ one from S × S into S such that 


A1. For any A and B in S, $A \cap B = B \cap A$, 

A2. For any A, B, and C in S, $A \cap (B \cap C) = (A \cap B) \cap C$, 

A3. For any A, B, and C in S, $A \cap \overline{B} = C \cap \overline{C}$ if, and only if, $A \cap B = A$, 

A4. For any A and B in S, if $A = B$, then $\overline{A} = \overline{B}$, and 

A5. For any A, B, and C in S, if $A = B$, then $A \cap C = B \cap C$. 


Notes: (a) Constraints A1-A5 are what the literature calls postulates for Boolean 
algebras. The first three are Byrne's in his [1946], and the last two (taken for 
granted by Byrne) are Rosenbloom's in his [1950]. 

(b) = in A1-A5 is of course the identity relation. So $A \cap B$ and $B \cap A$ are 
identical members of S by A1, $A \cap (B \cap C)$ and $(A \cap B) \cap C$ identical members of 
that set by A2, etc.; and the Boolean algebras defined here are Boolean algebras 
with respect to identity. Why I underline this will become clear in 3. 


1 Since their Boolean algebras were fields (i.e. Boolean algebras with sets their members, $\overline{A}$ the 
complement of A, and $A \cap B$ the intersection of A and B), Kolmogorov and Rényi talked of 
probability fields rather than probability spaces. Kolmogorov would call the probability spaces I 
study here generalized probability spaces, but Suppes in his [1974] calls them finitely additive 
spaces. Footnote 2 has a word about countably additive ones. 







(c) It is usually required of the set S in a Boolean algebra $\langle S, \bar{\ }, \cap \rangle$ that it 
have at least two members. It will follow from each of Kolmogorov's, Rényi's, 
and Popper's accounts of a probability space that S does. 
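
The postulates A1-A5 can be checked mechanically on a small case. The following is a minimal sketch, assuming a hypothetical instance that is not in the text: S is the power set of a two-element universe, complement is set complement, and the binary operation is set intersection. The names `UNIVERSE`, `comp`, `meet`, and `satisfies_a1_to_a5` are illustrative only.

```python
from itertools import combinations, product

# Hypothetical instance (an assumption, not from the text): S is the power
# set of a two-element universe, with set complement and set intersection.
UNIVERSE = frozenset({0, 1})
S = [frozenset(c) for r in range(len(UNIVERSE) + 1)
     for c in combinations(sorted(UNIVERSE), r)]


def comp(a):
    """Complement of a relative to UNIVERSE."""
    return UNIVERSE - a


def meet(a, b):
    """Intersection of a and b."""
    return a & b


def satisfies_a1_to_a5(elements, comp, meet):
    """Brute-force check of postulates A1-A5, with '=' read as identity."""
    pairs = list(product(elements, repeat=2))
    triples = list(product(elements, repeat=3))
    a1 = all(meet(a, b) == meet(b, a) for a, b in pairs)
    a2 = all(meet(a, meet(b, c)) == meet(meet(a, b), c) for a, b, c in triples)
    a3 = all((meet(a, comp(b)) == meet(c, comp(c))) == (meet(a, b) == a)
             for a, b, c in triples)
    # A4 and A5 hold automatically for genuine functions under identity;
    # they are spelled out here only to mirror the list of postulates.
    a4 = all(comp(a) == comp(b) for a, b in pairs if a == b)
    a5 = all(meet(a, c) == meet(b, c) for a, b, c in triples if a == b)
    return a1 and a2 and a3 and a4 and a5


RESULT = satisfies_a1_to_a5(S, comp, meet)
```

Because identity is built into Python's `==` on frozensets, A4 and A5 are trivial here; they become substantive only when `=` is replaced by a weaker equivalence relation.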


2. By an absolute probability space in the sense of Kolmogorov understand a pair 
$\langle \langle S, \bar{\ }, \cap \rangle, P \rangle$, where 


(i) $\langle S, \bar{\ }, \cap \rangle$ is a Boolean algebra and 
(ii) P is a real-valued unary function on S such that 
K1. For any A in S, $0 \le P(A)$, 
K2. For any A in S, $P(A \cup \overline{A}) = 1$, and 
K3. For any A and B in S, if $A \cap B = C \cap \overline{C}$ for some C in S, then 
$P(A \cup B) = P(A) + P(B)$. 


Notes: (a) ‘K’ signals that the constraints are essentially Kolmogorov's in his 
[1933]; and ‘$A \cup B$’ is short of course for ‘$\overline{\overline{A} \cap \overline{B}}$’. 
(b) Given A1-A5 and K1-K2, this constraint 


K3'. For any A and B in S, $P(A) = P(A \cap B) + P(A \cap \overline{B})$ 


is equivalent, as proved in the Appendix, to K3 and may therefore substitute 
for it. It brings Kolmogorov's account closer to Popper's in 3.² 

(c) = in K2-K3 is the identity relation again; but in K3 it plays a double role, 
with ‘$A \cap B = C \cap \overline{C}$’ to the effect that $A \cap B$ and $C \cap \overline{C}$ are identical members of S 
but ‘$P(A \cup B) = P(A) + P(B)$’ to the effect that $P(A \cup B)$ and $P(A) + P(B)$ are 
identical reals. = also plays that double role in what I call Leibniz's Rule for 
Absolute Probabilities, to wit: 


LAP. $A = B \therefore P(A) = P(B)$. 


It is LAP, a triviality of course when = is identity, that permits inferring the 
Permutation Law ‘$P(A \cap B) = P(B \cap A)$’ from A1, the Association Law 
‘$P(A \cap (B \cap C)) = P((A \cap B) \cap C)$’ from A2, etc. See 3 for more on this. 

(d) The probability spaces just defined are generalizations of those in 
Kolmogorov, where as indicated in footnote 1 the triple $\langle S, \bar{\ }, \cap \rangle$ or, for 
short, the set S had to be a field of sets. Here S may consist of whatever there is a 
Boolean algebra of: sets, relations, individuals, propositions, etc. 
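
The equivalence of K3 and K3' is proved in the Appendix; a finite case can at least be spot-checked by machine. The sketch below assumes a hypothetical space that is not in the text: the power set of a three-element universe with the uniform measure. Union is defined from complement and intersection, as in note (a). All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical space (an assumption): power set of {0, 1, 2}, uniform measure.
UNIVERSE = frozenset({0, 1, 2})
S = [frozenset(c) for r in range(len(UNIVERSE) + 1)
     for c in combinations(sorted(UNIVERSE), r)]
NULL = frozenset()  # plays the role of C-intersect-complement-of-C


def comp(a):
    return UNIVERSE - a


def meet(a, b):
    return a & b


def join(a, b):
    """A-union-B, defined as the complement of comp(A) meet comp(B)."""
    return comp(meet(comp(a), comp(b)))


def P(a):
    """Uniform absolute probability; exact arithmetic via Fraction."""
    return Fraction(len(a), len(UNIVERSE))


PAIRS = list(product(S, repeat=2))
K1 = all(0 <= P(a) for a in S)
K2 = all(P(join(a, comp(a))) == 1 for a in S)
K3 = all(P(join(a, b)) == P(a) + P(b) for a, b in PAIRS if meet(a, b) == NULL)
K3_PRIME = all(P(a) == P(meet(a, b)) + P(meet(a, comp(b))) for a, b in PAIRS)
```

Note that `K3` is checked only for pairs whose intersection is null, whereas `K3_PRIME` is checked for every pair, which reflects the remark that K3' is not a conditional.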


2 In accounts of countably additive spaces K3 is generalized to read thus: 
For any $A_1, A_2, \ldots, A_n, \ldots$ in S, if $A_j \cap A_k = \emptyset$ for $j \ne k$ ($k = 1, 2, \ldots, n, \ldots$), then 

$P\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} P(A_k)$, 

where ‘$\emptyset$’ is short for, say, ‘$A_1 \cap \overline{A_1}$’ and ‘$\bigcup_{k=1}^{\infty} A_k$’ short for ‘$(\ldots(A_1 \cup A_2) \cup \ldots) \cup A_k \ldots$’. 
A countable generalization of K3' that is as compact as the above is very much needed. Note in 
behalf of K3' that it is not a conditional; it proves as a result far handier than K3 in foundational 
studies. 




3. By an absolute probability space in the sense of Popper understand a quadruple 
$\langle S, \bar{\ }, \cap, P \rangle$, where S is a non-empty set, $\bar{\ }$ a function from S into S, $\cap$ one 
from S × S into S, and P a real-valued unary one on S such that 


AP1 = K1 = For any A in S, $0 \le P(A)$, 

AP2 = K2 = For any A in S, $P(A \cup \overline{A}) = 1$, 

AP3 = K3' = For any A and B in S, $P(A) = P(A \cap B) + P(A \cap \overline{B})$, 

AP4. For any A and B in S, $P(A \cap B) \le P(B \cap A)$, 

AP5. For any A, B, and C in S, $P(A \cap (B \cap C)) \le P((A \cap B) \cap C)$, and 

AP6. For any A in S, $P(A) \le P(A \cap A)$.³ 


Notes: (a) With 'A' for 'absolute' and 'P' for 'Popper', 'AP' signals that the 
foregoing constraints are Popper's in his [1955], except for the first two in 
place of which he used 


AP1'. For any A and B in S, $P(A \cap B) \le P(A)$ and 
AP2'. For any A in S there is a B in S such that $P(B) \ne 0$ and 
$P(A \cap B) = P(A) \times P(B)$.⁴ 


That, given AP3-AP6, AP1-AP2 are equivalent to AP1'-AP2' has been 
known for some time, and was surely known to Popper all along. Proof of the 
equivalence will be found in my [1982]. 

(b) As remarked on p. 1, the triple $\langle S, \bar{\ }, \cap \rangle$ or, for short, the set S in 
Popper's account is not presumed to be a Boolean algebra with respect to 
identity; and nowhere in the account is any member of S declared identical 
with another, the way $A \cap B$ was declared identical with $B \cap A$ by A1, $A \cap (B \cap C)$ 
declared identical with $(A \cap B) \cap C$ by A2, etc. Think, however, of the symmetric 
difference 


$(A \cap \overline{B}) \cup (B \cap \overline{A})$ 


of A and B. When A and B are sets, that difference consists of the members of A 
that do not belong to B plus those of B that do not belong to A. Suppose next 
that for some C in S 


$(A \cap \overline{B}) \cup (B \cap \overline{A}) = C \cap \overline{C}$. 


Then, as a simple diagram shows, A and B are identical with (and hence 


3 $P(A \cap B) \ge P(B \cap A)$ is tantamount of course to AP4; $P(A \cap (B \cap C)) \ge P((A \cap B) \cap C)$ follows from 
AP5 by repeated applications of AP4; and $P(A) \ge P(A \cap A)$ follows from AP3 (with B there taken 
to be A) and AP1. 

4 Popper had published an earlier set of constraints in his [1938], but those, as he admits in his 
[1959], Appendix *ii, were ‘somewhat clumsy’. On p. 53, footnote 1, of his [1955] he splits up 
AP2' into these two constraints: 

For some A and some B in S, $P(A) \ne P(B)$ 
and 
For any A in S there is a B in S such that $P(B) \ge P(A)$ and $P(A \cap B) = P(A) \times P(B)$. 
Why Popper favored AP1'-AP2' over AP1-AP2 is discussed in my [1989]. 




indiscernible from) each other. So, generalizing upon this example, define ‘A is 
indiscernible from B’ or, for short, ‘$A \approx B$’ thus: 


$A \approx B =_{df} P(\overline{(A \cap \overline{B}) \cup (B \cap \overline{A})}) = 1$. 


Then, as shown in my [1989], Popper's constraints AP1-AP6 will deliver the 
results of replacing ‘=’ everywhere in A1-A5 by ‘$\approx$’ plus this version of LAP: 


LAP'. $A \approx B \therefore P(A) = P(B)$.⁵ 


So AP1-AP6 compel S to constitute a Boolean algebra with respect to 
indiscernibility.⁶ 

(d) It follows from Popper's account, as it does from Kolmogorov's, that 
$P(A \cap \overline{A}) = 0$ for any A in S. Since $P(A \cup \overline{A}) = 1$ for any such A, S is sure to have 
at least two members, whatever kind of Boolean algebra it may be. 
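
The gap between identity and indiscernibility can be made concrete. The sketch below is an illustration under stated assumptions, not Popper's or the author's own construction: it takes the power set of {0, 1, 2} with a hypothetical weighting that gives the point 2 probability zero, and it reads ‘A is indiscernible from B’ as requiring that the complement of the symmetric difference of A and B have probability 1. Two sets that differ only on the null point then come out indiscernible without being identical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical weighting (an assumption): the point 2 carries probability 0.
UNIVERSE = frozenset({0, 1, 2})
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}
S = [frozenset(c) for r in range(len(UNIVERSE) + 1)
     for c in combinations(sorted(UNIVERSE), r)]


def comp(a):
    return UNIVERSE - a


def meet(a, b):
    return a & b


def join(a, b):
    return comp(meet(comp(a), comp(b)))


def P(a):
    """Absolute probability of a: the sum of the weights of its points."""
    return sum((WEIGHT[x] for x in a), Fraction(0))


def indiscernible(a, b):
    """True when the complement of the symmetric difference has probability 1."""
    sym_diff = join(meet(a, comp(b)), meet(b, comp(a)))
    return P(comp(sym_diff)) == 1


A = frozenset({0})
B = frozenset({0, 2})

# Equivalence classes of S under indiscernibility.
CLASSES = {frozenset(x for x in S if indiscernible(x, a)) for a in S}

# LAP': indiscernible members always receive the same probability.
LAP_PRIME = all(P(a) == P(b) for a in S for b in S if indiscernible(a, b))
```

In this example the eight members of S fall into four classes of two members each, so S behaves like a four-element Boolean algebra once indiscernibility is substituted for identity.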


4. By a relative probability space in the sense of Rényi understand a pair 
$\langle \langle S, \bar{\ }, \cap \rangle, P \rangle$, where 


(i) $\langle S, \bar{\ }, \cap \rangle$ is a Boolean algebra and 
(ii) P is a real-valued binary function on S such that 
R1. For any A and B in S, $0 \le P(A,B)$, 
R2. For any A in S, $P(A,A) = 1$, 
R3. For any A, B, and C in S, if $A \cap B = D \cap \overline{D}$ but $C \ne D \cap \overline{D}$ for some D in S, 
then $P(A \cup B, C) = P(A,C) + P(B,C)$, 
R4. For any A, B, and C in S, $P(A \cap B,C) = P(A,B \cap C) \times P(B,C)$, and 
R5. There is at least one A and at least one B in S such that $P(A,B) \ne 1$. 


Notes: (a) ‘R’ signals that except for R5 the constraints derive from Rényi's in his 
[1964]. 
(b) Given A1-A5, R1, R2, and R4, this constraint 


R3'. For any A and B in S, if $B \ne C \cap \overline{C}$ for some C in S, then 
$P(\overline{A},B) = 1 - P(A,B)$ 


is equivalent, as proved in the Appendix, to constraint R3 and may therefore 
substitute for it. It brings Rényi's account closer to Popper's in 5. As also 
proved in the Appendix, A1-A5, R2, R3', and R4 yield this weakened version of 
R3': 


5 With B the same as A, the value of P is of course the same for argument B as for argument A. So 
LAP is trivially true. LAP' is not, but proof of it is quite simple, as is proof that $\approx$ is reflexive, 
symmetrical, and transitive, thus constituting an equivalence relation. 

6 So far as I can ascertain, Popper nowhere provides definition of $\approx$ for his absolute probability 
theory nor as a result proof that his set S constitutes a Boolean algebra with respect to 
indiscernibility. That there are Boolean algebras with respect to equivalence relations other than 
identity is recognized in Rosenbloom's [1950], pp. 9-10 and 13-14, and is presumed (though 
not explicitly stated) in Popper's [1959], Appendix *v. Various authors are unclear or downright 
faulty on this matter. 




For any A and B in S, if $P(C,B) \ne 1$ for some C in S then 
$P(\overline{A},B) = 1 - P(A,B)$, 


which appears as RP3 in Popper's account of a relative probability space and 
proves of critical importance there. 

(c) = plays here the same double role as in 2. The counterparts of LAP are 
these two Leibniz's Rules for Relative Probabilities 


LRP1. $A = A' \therefore P(A,B) = P(A',B)$ 
and 
LRP2. $B = B' \therefore P(A,B) = P(A,B')$. 


Trivially true like LAP, they permit inferring from A1 these two Permutation 
Laws, to reappear in 5: 


$P(A \cap B,C) = P(B \cap A,C)$ 
and 
$P(A,B \cap C) = P(A,C \cap B)$. 


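(d) Suppose the restriction ‘$C \ne D \cap \overline{D}$ for some D in S’ in R3 were dropped, and 
for the occasion abridge each of ‘$A \cap \overline{A}$’ and ‘$D \cap \overline{D}$’ as ‘$\emptyset$’. Since by A1-A5 
$\emptyset \cap \emptyset = \emptyset$ and $\emptyset \cup \emptyset = \emptyset$, by R2, R3, and LRP1 $P(\emptyset,\emptyset)$ would equal both 1 and 2. 
Or suppose the restriction ‘$B \ne C \cap \overline{C}$ for some C in S’ in R3' were dropped. Then by 
R2 and R3' $P(\overline{\emptyset},\emptyset)$ would equal 0. But, since $\overline{\emptyset} \cap \emptyset = \emptyset$ by A1-A5, $P(\overline{\emptyset},\emptyset)$ would 
also equal 1 by Lemma 3 in the Appendix. 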

(e) The probability spaces just defined are but some of those acknowledged by 
Rényi, who introduces a second set S' and takes P to be a real-valued function 
on S × S'. In his [1955] S' may be any non-empty subset of S, but in his [1964] 
it becomes any non-empty subset of S to which the null set does not belong but 
to which $A \cup B$ does if A and B do. Because of the first restriction placed on S' in 
[1964], R3 reads there 


For any A and B in S and any C in S', if $A \cap B = D \cap \overline{D}$ for some D in S, then 
$P(A \cup B,C) = P(A,C) + P(B,C)$, 


and R3' would read: 
For any A in S and B in S', $P(\overline{A},B) = 1 - P(A,B)$, 


a point to which I return in the next note. The probability spaces defined here 
are generalizations of their counterparts in Rényi, where the members of S, 
hence those of S', and hence the arguments of P had to be sets. Here the 
members of S may again be whatever there is a Boolean algebra of.⁷ 


7 In particular, propositions. The relative probability functions that then result are essentially 
those of Carnap's in his [1950]. In later writings Carnap sometimes favours those among 
Rényi's functions that meet this extra constraint: 


For any A and B in S, if $P(A,B \cup \overline{B}) = 1$, then $A = B \cup \overline{B}$. 
For further information on these matters, see Leblanc [1989] and Leblanc and Roeper [1989]. 




(f) Suppose that, as R2 has it, 
For any A in S, P(A,A)=1, 
and suppose further that, as R5 has it, 


There is at least one A and at least one B in S such that $P(A,B) \ne 1$. 


Then, clearly, A and B cannot be the same and hence S has to have at least two 
members. But, given R2, if S has only one member, then P(A,B) equals 1 for 
any A and B in S and each of R1 and R4 is trivially true. And, with S one-membered, 
$\overline{A}$ has to be the same as A, hence this condition in R3' 


$B \ne C \cap \overline{C}$ for some C in S 


is false, and hence R3' is vacuously true. So, each of R1, R2, R3', and R4 is 
compatible with S having only one member. So, if S is to be compelled to have 
more than one member, then R5 or some other constraint to the same effect 
must be placed upon P. The same reasoning goes through with R3 in place of 
R3'. However, in Rényi's own account of things R5 automatically holds true 
and hence is dispensed with. For suppose S there had but one member, say, A. 
Then A and $\overline{A}$ would again be the same, hence A and $A \cap \overline{A}$ would be the same, 
and hence by the second of Rényi's 1964 specifications regarding S', A could 
not belong to S'. But, if so, then S' would be empty contrary to the first of those 
specifications.⁸ 


5. By a relative probability space in the sense of Popper understand a quadruple 
$\langle S, \bar{\ }, \cap, P \rangle$, where S is a non-empty set, $\bar{\ }$ a function from S into S, $\cap$ one 
from S × S into S, and P a real-valued binary one on S such that 


RP1 = R1 = For any A and B in S, $0 \le P(A,B)$, 

RP2 = R2 = For any A in S, $P(A,A) = 1$, 

RP3. For any A and B in S, if $P(C,B) \ne 1$ for some C in S, then 
$P(\overline{A},B) = 1 - P(A,B)$, 

RP4 = R4 = For any A, B, and C in S, $P(A \cap B,C) = P(A,B \cap C) \times P(B,C)$, 

RP5 = R5 = There is at least one A and at least one B in S such that 
$P(A,B) \ne 1$, 

RP6. For any A, B, and C in S, $P(A \cap B,C) \le P(B \cap A,C)$, and 

RP7. For any A, B, and C in S, $P(A,B \cap C) \le P(A,C \cap B)$.⁹ 


Notes: (a) With 'R' for ‘relative’ and 'P' again for ‘Popper’, ‘RP’ signals that the 
constraints derive from Popper's [1957], which ran: 

8 The argument, shorter than my own, was suggested by the referee. 
9 That RP1-RP7 also deliver 
$P(A \cap (B \cap C),D) = P((A \cap B) \cap C,D)$, 
$P(A,B \cap (C \cap D)) = P(A,(B \cap C) \cap D)$, 


$P(A \cap A,B) = P(A,B)$, 
$P(A,B \cap B) = P(A,B)$, 

etc., follows from the result of Popper’s reported in (b). 




RP1'. For any A, B, and C in S, $P(A \cap B,C) \le P(A,C)$, 

RP2'. For any A and B in S, $P(A,A) = P(B,B)$, 

RP3'. For any A and B in S, if $P(C,B) \ne P(B,B)$ for some C in S, then 
$P(\overline{A},B) = P(B,B) - P(A,B)$, 

RP4' = RP4 = For any A, B, and C in S, $P(A \cap B,C) = P(A,B \cap C) \times P(B,C)$, 

RP5'. For at least one A, one B, one C, and one D in S, $P(A,B) \ne P(C,D)$, and 

RP6'. For any A, B, and C in S, if $P(A,D) = P(B,D)$ for every D in S, then 
$P(C,A) = P(C,B)$.¹⁰ 


Popper states in [1959], Appendix *iv, that RP1'-RP4' and RP6' are equivalent 
to RP1-RP4 and RP6-RP7, and gives proof of it except in the case of RP6'. The 
first published proof that RP1-RP4 and RP6-RP7 yield RP6' may be in my 
[1981]; it is the shortest, at any rate, that has come to my attention. That, 
given RP1'-RP4' and RP6', constraints RP5' and RP5 are equivalent was 
noted by Popper shortly after the publication of his [1959].¹¹ 

(b) As remarked on p. 167, the set S in Popper's account is not presumed to 
be a Boolean algebra with respect to identity; and nowhere in the account is 
any member of S declared identical with another. But define ‘A is substitutionally 
equivalent to B’ or, for short, ‘$A \approx B$’ thus: 


$A \approx B =_{df} P(A,C) = P(B,C)$ for every C in S. 


Then, as Popper showed in his [1959], Appendix *v, constraints RP1'-RP6' 
(hence also constraints RP1-RP7) deliver the results of replacing ‘=’ 
everywhere in (constraints of Huntington's equivalent to) A1-A5 by ‘$\approx$’. They 
also deliver these versions of Leibniz's Rules for Relative Probabilities: 


LRP1'. $A \approx A' \therefore P(A,B) = P(A',B)$ 
and 
LRP2'. $B \approx B' \therefore P(A,B) = P(A,B')$. 


So RP1'-RP6' compel S to constitute a Boolean algebra with respect to 
substitutional equivalence.¹² Since 


$P(A,C) = P(B,C)$ for every C in S if, and only if, 
$P(\overline{(A \cap \overline{B}) \cup (B \cap \overline{A})},C) = 1$ for every such C, 


Popper's relation of substitutional equivalence is one of indiscernibility. So his 


10 For some reason 1956 is given by many, Popper included, as the publication date of Mace's 
book. Yet my copy of it says ‘First Published in 1957’ on the back of the title page. RP1'-RP6' 
will be found on p. 191 of [1957], where Popper calls them a ‘slight improvement’ of 
constraints in [1955], p. 56: ‘a considerable improvement’ would be the apter characterization. 
Why Popper favors ‘P(B,B)’ over ‘1’ in RP2' and RP3', and why he favors RP1' over RP1, is 
discussed in my [1989]. Popper's first use of RP5 in lieu of RP5' may have been in his [1962]. 

11 For yet another proof that constraints RP1'-RP6' and RP1-RP7 are equivalent, see Harper, 
Leblanc, and van Fraassen’s [1983]. 

12 Popper points out of course that substitutional equivalence is reflexive, symmetrical, and 
transitive. 




result, coupled with that in 3, shows the set S in a Popper probability space 
$\langle S, \bar{\ }, \cap, P \rangle$ to constitute a Boolean algebra with respect to indiscernibility, 
this whether the function P there is an absolute probability function or a 
relative one. 

(c) Though Popper talked in his [1959], Appendix *v, of substitutional 
equivalence rather than identity, he nonetheless abridged 


$P(A,C) = P(B,C)$ for every C in S 
as 
$A = B$ 
rather than, say, 
$A \approx B$ 


as I just did. This has made for misunderstandings, some readers concluding 
that given Popper's definition RP1'-RP6' deliver constraints A1-A5 themselves 
rather than the results of replacing ‘=’ everywhere in A1-A5 by ‘$\approx$’. To 
clear up matters, consider the following binary function $P_2$ on the three-membered 
set {a,b,c} such that $\overline{a} = \overline{b} = c$, $\overline{c} = a$, $a \cap b = b$, and $a \cap c = b \cap c = c$: 


                 B 
P_2(A,B)     a   b   c 

        a    1   1   1 
   A    b    1   1   1 
        c    0   0   1 


$P_2$ meets constraints RP1'-RP6', as the reader may verify. So by Popper's 
result {a,b,c} constitutes a Boolean algebra with respect to indiscernibility. 
Incidentally, the unary function $P_1$ on {a,b,c} such that $P_1(A) = P_2(A, a \cup \overline{a})$ 
meets constraints AP1-AP6. So {a,b,c} constitutes a Boolean algebra with 
respect to indiscernibility under $P_1$ as well as under $P_2$. But it cannot constitute 
a Boolean algebra with respect to identity: finite Boolean algebras of that sort 
are only of the sizes 2, 4, ..., $2^n$, .... Quite importantly, though, $P_2$ does 
constitute a relative probability function in Popper's sense and $P_1$ an absolute 
one. So, by allowing the set S on which function P in a probability space $\langle S, \bar{\ }, \cap, P \rangle$ 
is defined to be arbitrary Popper ushered in a host of new probability 
functions, relative functions and absolute ones. The point is studied at length 
in Leblanc and Roeper’s [1989]. 
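
The three-membered example lends itself to a brute-force check. The sketch below is only an illustration under stated assumptions: the text fixes the complements and the values of a∩b, a∩c, and b∩c, while the remaining entries of the intersection table (a∩a, b∩b, c∩c, b∩a, c∩a, c∩b) are filled in here on the assumption that the operation is idempotent and commutative. It tests the binary function against RP1-RP7 and the derived unary function against AP1-AP6. All Python names are hypothetical.

```python
from itertools import product

S = ["a", "b", "c"]
COMP = {"a": "c", "b": "c", "c": "a"}
# From the text: a∩b = b, a∩c = b∩c = c. The other six entries are an
# assumption (idempotence and commutativity), not stated in the text.
MEET = {
    ("a", "a"): "a", ("a", "b"): "b", ("a", "c"): "c",
    ("b", "a"): "b", ("b", "b"): "b", ("b", "c"): "c",
    ("c", "a"): "c", ("c", "b"): "c", ("c", "c"): "c",
}
# Table of the binary function: first argument is the row, second the column.
P2_TABLE = {
    ("a", "a"): 1, ("a", "b"): 1, ("a", "c"): 1,
    ("b", "a"): 1, ("b", "b"): 1, ("b", "c"): 1,
    ("c", "a"): 0, ("c", "b"): 0, ("c", "c"): 1,
}


def comp(x):
    return COMP[x]


def meet(x, y):
    return MEET[(x, y)]


def join(x, y):
    return comp(meet(comp(x), comp(y)))


def P2(x, y):
    return P2_TABLE[(x, y)]


def P1(x):
    """Unary function obtained by fixing the second argument at a-or-not-a."""
    return P2(x, join("a", comp("a")))


PAIRS = list(product(S, repeat=2))
TRIPLES = list(product(S, repeat=3))


def meets_rp1_to_rp7():
    return all([
        all(0 <= P2(a, b) for a, b in PAIRS),
        all(P2(a, a) == 1 for a in S),
        all(P2(comp(a), b) == 1 - P2(a, b)
            for a, b in PAIRS if any(P2(c, b) != 1 for c in S)),
        all(P2(meet(a, b), c) == P2(a, meet(b, c)) * P2(b, c)
            for a, b, c in TRIPLES),
        any(P2(a, b) != 1 for a, b in PAIRS),
        all(P2(meet(a, b), c) <= P2(meet(b, a), c) for a, b, c in TRIPLES),
        all(P2(a, meet(b, c)) <= P2(a, meet(c, b)) for a, b, c in TRIPLES),
    ])


def meets_ap1_to_ap6():
    return all([
        all(0 <= P1(a) for a in S),
        all(P1(join(a, comp(a))) == 1 for a in S),
        all(P1(a) == P1(meet(a, b)) + P1(meet(a, comp(b))) for a, b in PAIRS),
        all(P1(meet(a, b)) <= P1(meet(b, a)) for a, b in PAIRS),
        all(P1(meet(a, meet(b, c))) <= P1(meet(meet(a, b), c))
            for a, b, c in TRIPLES),
        all(P1(a) <= P1(meet(a, a)) for a in S),
    ])


def substitutionally_equivalent(x, y):
    """True when x and y get the same value against every second argument."""
    return all(P2(x, c) == P2(y, c) for c in S)
```

Under these assumptions a and b are substitutionally equivalent yet distinct, which is why the three members collapse into two classes, a size that an algebra with respect to identity could have but a three-membered one cannot.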

To pursue the matter a bit further, suppose the set on which a probability 
function P of Popper’s is defined happens to be a Boolean algebra with respect 
to identity, and suppose P is an absolute function. Since K3 and K3' are 
equivalent constraints, P is then sure to be a Kolmogorov function as well. 
Suppose, however, P is a relative function. Since constraint R3—as proved in 




the Appendix—is stronger than constraint R3', P need not be a Rényi function 
as well. Indeed, as shown in Roeper and Leblanc's [1989], Rényi's functions 
are only those among Popper's relative probability functions that meet this 
extra constraint 


If P(A,C) =P(B,C) for every C in S, then A=B, 


according to which indiscernibility is tantamount to identity. So, by weakening 
R3 to read like R3', Popper ushered in yet other relative probability 
functions. 


6. The constraints placed in 3 on absolute probability functions and those 
placed in 5 on relative ones are independent: none of AP1-AP6, for example, 
can be gotten from the rest, nor can any one of RP1-RP7 be. Anxious, 
however, to further accentuate the autonomy of probability theory, Popper 
introduced in [1959], Appendix *iv, a further notion, that of 'autonomous 
independence'. His constraints RP1'-RP6', for example, are autonomously 
independent in that none can be gotten from the rest even in the presence of 
A1-A5 and LRP1-LRP2. Two of constraints RP1-RP7, on the other hand, are 
not autonomously independent: though RP6 cannot be gotten from RP1-RP5 
and RP7, it can be gotten from A1 and LRP1 (hence, from RP1-RP5, RP7, A1- 
A5, and LRP1-LRP2), and a like remark applies to RP7. As a result, Popper 
strongly favored RP1'-RP6' over the more current RP1-RP7. (Note, however, 
that RP1’, RP2, RP3, RP5, and RP6' would also be autonomously independent 
and in my opinion are better intended than Popper's own constraints: use of 
‘P(B,B)’ in lieu of ‘1’ is distracting.) 

Popper's constraints AP1'-AP2' and AP3-AP6, on the other hand, fare no
better as regards autonomous independence than the more current AP1-AP6
do. Three constraints common to both sets are not autonomously independent,
to wit: AP4-AP6, which follow from A1-A3 by dint of LAP. And failure to
come up with autonomously independent constraints for absolute probability
functions may be one reason why Popper eventually lost interest in them.¹⁴

However, even Popper's claim of autonomous independence for RP1'-RP6'
can be challenged.¹⁵ Suppose indeed that as in some versions of the Boolean
Algebra of Classes the inclusion relation $\subset$ were used in lieu of the identity


13 Popper states in his [1959] that AP1'-AP2' and AP3-AP6 (hence, AP1-AP6) are independent.
Proof that RP1'-RP6' are independent is provided in his [1959], Appendix *iv, and proof that
RP1-RP6 are may be found in Harper, Leblanc, and van Fraassen's [1983].

14 Another was his failure to define relative probability functions in terms of absolute ones. But
that cannot be done, and that it cannot hardly reflects on the latter functions. Despite Popper's
eventual disenchantment with them, absolute probability functions are still in use in and out of
science. Relative ones can be, and have been, put to similar use, but I doubt they will ever render
their restrictions to $A \cup \bar{A}$ obsolete.

15 I shall further examine Popper's claim in a subsequent paper.


relation $=$, the postulates for Boolean algebras in Bernstein's [1934] were
enlisted in lieu of A1-A5, and


LRP1'. If $A \subset B$, then $P(B,C) \geq P(A,C)$

were enlisted in lieu of LRP1. Then with $=$ defined here thus

$A = B =_{Df} A \subset B$ and $B \subset A$

and $\subset$ defined in §2 thus

$A \subset B =_{Df} A \cap B = A$,


the two accounts would deliver exactly the same Boolean algebras. But one of
RP1'-RP6', to wit:


For any A, B, and C in S, $P(A \cap B, C) \leq P(A, C)$,


would follow by dint of LRP1' from the constraints doing duty for A1-A5; and,
similarly, one of AP1'-AP2' and AP3-AP6, to wit:


For any A and B in S, $P(A \cap B) \leq P(A)$,

would follow from them by dint of this version of LAP

LAP'. If $A \subset B$, then $P(B) \geq P(A)$.


Note, by the way, that using RP1 in lieu of RP1' would not help in the first case:
it is together with RP7 that RP1 can do duty for RP1'.

Possibly aware of this and anxious in any event for his constraints to be
'creative', Popper in his [1963] placed on relative probability functions yet
other constraints, autonomous, independent, and autonomously independent
(whether Boolean algebras be defined the identity way or the inclusion way).
The first requires in the manner of RP5 that the set S have at least two
members, the second is a 'creative' constraint on $P(\bar{A},C)$ which nowhere uses
'$\bar{\ }$', and the third is a 'creative' constraint on $P(A \cap B,D)$ which nowhere uses
'$\cap$'.


A (= R5). For at least one A and at least one B in S, $P(A,B) \neq 1$,

B. For any A and B in S, $P(\bar{A},C) = P(B,C)$ for every C in S if, and only if, for any D
in S,

$P(A,D) + P(B,D) = P(D,D)$ if, and only if,
$P(D,D) \neq P(E,D)$ for some E in S, and

C. For any A, B, and C in S, $P(A \cap B,D) = P(C,D)$ for every D in S if, and only if,
for any E and F in S,

(i) $P(A,B) \leq P(C,B)$,
(ii) $P(A,E) \geq P(C,E) \leq P(B,C)$, and
(iii) if $P(B,E) \leq P(F,E)$ and $P(B,F) \geq P(E,F) \leq P(E,F)$, then
$P(A,F) \times P(B,E) = P(C,E)$.


Proof that A-C yield RP1'-RP6' will be found in Popper's [1963] and is
breathtakingly simple.

As they bear no resemblance whatever to previous constraints on relative
probability functions, A-C conceal the provenance of probability theory and,
in my opinion, obscure rather than illuminate what probability is. But, like the
constraints in §3 and §5, they are a veritable tour de force and demand further
study, as does Popper's notion of creative probability constraints. The notion is
akin to that of separated constraints and rules, but intriguingly differs from it.


APPENDIX 
A. That, given A1-A5, K1-K2, and LAP,

K3. If $A \cap B = C \cap \bar{C}$ for some C in S, then $P(A \cup B) = P(A) + P(B)$

and

K3'. $P(A) = P(A \cap B) + P(A \cap \bar{B})$


are equivalent.


(i) Suppose K3 given. 


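Theorem 1. $P(A) = P(A \cap B) + P(A \cap \bar{B})$ (= K3').

Proof: $(A \cap B) \cap (A \cap \bar{B}) = B \cap \bar{B}$ by A1-A5. Hence
$P((A \cap B) \cup (A \cap \bar{B})) = P(A \cap B) + P(A \cap \bar{B})$ by K3.
But $A = (A \cap B) \cup (A \cap \bar{B})$ by A1-A5. Hence
$P(A) = P((A \cap B) \cup (A \cap \bar{B}))$ by LAP. Hence Theorem 1.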


(ii) Suppose K3' given.


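Lemma 1. $P(A \cap B) \leq P(A)$.
Proof: $P(A \cap B) = P(A) - P(A \cap \bar{B})$ (K3')
$\leq P(A)$ (K1).

Lemma 2. $P(A \cap A) = P(A)$.
Proof: $A \cap A = A$ by A5. Hence Lemma 2 by LAP.


Lemma 3. $P(A) = P(B \cap A) + P(\bar{B} \cap A)$.
Proof by K3', A1, and LAP.


Lemma 4. $P((A \cap \bar{A}) \cap B) = 0$.

Proof: $P((A \cap \bar{A}) \cap B) \leq P(A \cap \bar{A})$ (Lemma 1)
$\leq P(A) - P(A \cap A)$ (K3')
$\leq 0$ (Lemma 2)
$= 0$ (K1).


Lemma 5. $P(\overline{A \cap \bar{A}}) = 1$.
Proof: $A \cup \bar{A} = \overline{A \cap \bar{A}}$ by A1-A5. Hence Lemma 5 by LAP.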


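Lemma 6. $P(\bar{A}) = 1 - P(A)$.

Proof: $P(\bar{A}) = P((A \cap \bar{A}) \cap \bar{A}) + P(\overline{(A \cap \bar{A})} \cap \bar{A})$ (Lemma 3)
$= P(\overline{(A \cap \bar{A})} \cap \bar{A})$ (Lemma 4)
$= P(\overline{A \cap \bar{A}}) - P(\overline{(A \cap \bar{A})} \cap A)$ (K3')
$= 1 - P(\overline{(A \cap \bar{A})} \cap A)$ (Lemma 5)
$= 1 - P(A) + P((A \cap \bar{A}) \cap A)$ (K3')
$= 1 - P(A)$ (Lemma 4).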


Lemma 7. $P(A \cap \bar{A}) = 0$.
Proof by Lemma 5 and Lemma 6.


Lemma 8. If $A \cap B = C \cap \bar{C}$ for some C in S, then $P((A \cup B) \cap \bar{B}) = P(A)$.
Proof: If $A \cap B = C \cap \bar{C}$ for some C in S, then $(A \cup B) \cap \bar{B} = A$ by A1-A5 and
hence $P((A \cup B) \cap \bar{B}) = P(A)$ by LAP.


Lemma 9. $P((A \cup B) \cap B) = P(B)$.
Proof: $(A \cup B) \cap B = B$ by A1-A5. Hence Lemma 9 by LAP.


Theorem 2. If $A \cap B = C \cap \bar{C}$ for some C in S, then $P(A \cup B) = P(A) + P(B)$ (= K3).
Proof: Suppose $A \cap B = C \cap \bar{C}$ for some C in S.

$P(A \cup B) = P((A \cup B) \cap \bar{B}) + P((A \cup B) \cap B)$ (K3')
$= P(A) + P((A \cup B) \cap B)$ (Lemma 8 and hyp.)
$= P(A) + P(B)$ (Lemma 9).
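The equivalence just proved can be spot-checked on a small finite algebra. The following is a minimal sketch, not part of the proof: the three-point sample space, the weights, and the function names are assumptions introduced for illustration, and the zero element $C \cap \bar{C}$ is modelled as the empty set. An additive function passes both K3 and K3', while a deliberately non-additive one fails both.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical three-point sample space with exact rational weights.
ATOMS = ("x", "y", "z")
WEIGHT = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}
TOP = frozenset(ATOMS)
# S: all subsets of ATOMS, a Boolean algebra under intersection and complement.
S = [frozenset(c) for r in range(len(ATOMS) + 1) for c in combinations(ATOMS, r)]


def additive(A):
    """Absolute probability of A as the sum of its atoms' weights."""
    return sum((WEIGHT[w] for w in A), Fraction(0))


def squared(A):
    """A deliberately non-additive set function, used as a contrast case."""
    return additive(A) ** 2


def satisfies_K3(P):
    """K3: P(A u B) = P(A) + P(B) whenever A n B is the zero element."""
    return all(P(A | B) == P(A) + P(B) for A in S for B in S if not (A & B))


def satisfies_K3_prime(P):
    """K3': P(A) = P(A n B) + P(A n complement of B) for all A, B in S."""
    return all(P(A) == P(A & B) + P(A & (TOP - B)) for A in S for B in S)
```

Calling `satisfies_K3(additive)` and `satisfies_K3_prime(additive)` exercises every pair A, B in S; exact fractions are used so that the equalities are not blurred by floating-point rounding.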


B. That, given A1-A5, R1-R2, R4, and LRP1-LRP2,


R3. If $A \cap B = D \cap \bar{D}$ and $C \neq D \cap \bar{D}$ for some D in S, then
$P(A \cup B,C) = P(A,C) + P(B,C)$
and

R3'. If $B \neq C \cap \bar{C}$ for some C in S, then $P(\bar{A},B) = 1 - P(A,B)$


are equivalent.
(i) Suppose R3 given. 


Lemma 10. $P(A,B \cap B) = P(A,B)$.
Proof: $B \cap B = B$ by A5. Hence Lemma 10 by LRP2.


Lemma 11. If $A \cap B = B$, then $P(A,B) = 1$.
Proof: Suppose $A \cap B = B$.

$P(B,B) = P(A \cap B,B)$ (LRP1 and hyp.)
$= P(A,B \cap B) \times P(B,B)$ (R4)
$= P(A,B) \times P(B,B)$ (Lemma 10).

Hence $P(A,B) = 1$ by R2.


Theorem 3. If $B \neq C \cap \bar{C}$ for some C in S, then $P(\bar{A},B) = 1 - P(A,B)$ (= R3').
Proof: If $B \neq C \cap \bar{C}$ for some C in S, then $P(A \cup \bar{A},B) = P(A,B) + P(\bar{A},B)$ by R3,
and hence $P(\bar{A},B) = 1 - P(A,B)$ by Lemma 11.


(ii) Suppose R3' given.¹⁶


Lemma 12. If $B = C \cap \bar{C}$, then $P(A,B) = 1$.
Proof: If $B = C \cap \bar{C}$, then $A \cap B = B$ by A1-A5 and hence $P(A,B) = 1$ by Lemma
11.


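Lemma 13. $P(A \cap B,C) = P(B \cap A,C)$.
Proof by A1 and LRP1.


Lemma 14. If $C \neq D \cap \bar{D}$ for some D in S, then $P(A,C) = P(A \cap B,C) + P(A \cap \bar{B},C)$.
Proof: Let $C \neq D \cap \bar{D}$ for some D in S.


Case 1: $A \cap C = D \cap \bar{D}$. Then $\bar{A} \cap C = C$ by A1-A5, hence $P(\bar{A},C) = 1$ by Lemma
11, hence $P(A,C) = 0$ by R3' and the hypothesis on C, and hence
$P(A,C) = P(B,A \cap C) \times P(A,C) + P(\bar{B},A \cap C) \times P(A,C)$
$= P(B \cap A,C) + P(\bar{B} \cap A,C)$ (R4)
$= P(A \cap B,C) + P(A \cap \bar{B},C)$ (Lemma 13).


Case 2: $A \cap C \neq D \cap \bar{D}$. Then $1 = P(B,A \cap C) + P(\bar{B},A \cap C)$ by R3', and hence
$P(A,C) = P(B,A \cap C) \times P(A,C) + P(\bar{B},A \cap C) \times P(A,C)$
$= P(B \cap A,C) + P(\bar{B} \cap A,C)$ (R4)
$= P(A \cap B,C) + P(A \cap \bar{B},C)$ (Lemma 13).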


Lemma 15. $P((A \cup B) \cap B,C) = P(B,C)$.
Proof: $(A \cup B) \cap B = B$ by A1-A5. Hence Lemma 15 by LRP1.


Lemma 16. $P((A \cup B) \cap \bar{B},C) = P(A \cap \bar{B},C)$.
Proof: $(A \cup B) \cap \bar{B} = A \cap \bar{B}$ by A1-A5. Hence Lemma 16 by LRP1.


Lemma 17. $P(A \cup B,C) = P(A,C) + P(B,C) - P(A \cap B,C)$.
Proof:

Case 1: $C = D \cap \bar{D}$. Then Lemma 17 by Lemma 12.

Case 2: $C \neq D \cap \bar{D}$. Then

$P(A \cup B,C) = P((A \cup B) \cap B,C) + P((A \cup B) \cap \bar{B},C)$ (Lemma 14)
$= P(B,C) + P((A \cup B) \cap \bar{B},C)$ (Lemma 15)
$= P(B,C) + P(A \cap \bar{B},C)$ (Lemma 16)
$= P(B,C) + P(A,C) - P(A \cap B,C)$ (Lemma 14).


Theorem 4. If $A \cap B = D \cap \bar{D}$ and $C \neq D \cap \bar{D}$ for some D in S, then
$P(A \cup B,C) = P(A,C) + P(B,C)$ (= R3).

Proof: Suppose that $A \cap B = D \cap \bar{D}$ for some D in S, in which case
$\overline{(A \cap B)} \cap C = C$ by A1-A5, and suppose that $C \neq D \cap \bar{D}$ for some such D.
Then $P(\overline{A \cap B},C) = 1$ by Lemma 11, hence $P(A \cap B,C) = 0$ by R3', and hence
$P(A \cup B,C) = P(A,C) + P(B,C)$ by Lemma 17.
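As with part A, the statements of part B can be spot-checked on a finite algebra. The sketch below is illustrative only and rests on assumptions not in the text: a hypothetical three-point space with strictly positive weights, a ratio-style relative function, and the convention (in the spirit of Lemma 12) that P(A,C) = 1 when C is the zero element, modelled here as the empty set. It checks R3, R3', and Lemma 17 by brute force.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical three-point sample space; every atom has positive weight.
ATOMS = ("x", "y", "z")
WEIGHT = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}
TOP = frozenset(ATOMS)
BOTTOM = frozenset()  # plays the role of D n complement-of-D
S = [frozenset(c) for r in range(len(ATOMS) + 1) for c in combinations(ATOMS, r)]


def m(A):
    """Weight of A: the sum of the weights of its atoms."""
    return sum((WEIGHT[w] for w in A), Fraction(0))


def P(A, C):
    """Relative probability of A given C; set to 1 when C is the zero element."""
    if C == BOTTOM:
        return Fraction(1)
    return m(A & C) / m(C)


def satisfies_R3():
    """R3: additivity for disjoint A, B relative to any non-zero C."""
    return all(P(A | B, C) == P(A, C) + P(B, C)
               for A in S for B in S for C in S
               if A & B == BOTTOM and C != BOTTOM)


def satisfies_R3_prime():
    """R3': P(complement of A, B) = 1 - P(A, B) for any non-zero B."""
    return all(P(TOP - A, B) == 1 - P(A, B)
               for A in S for B in S if B != BOTTOM)


def satisfies_lemma_17():
    """Lemma 17: general addition law, with no restriction on C."""
    return all(P(A | B, C) == P(A, C) + P(B, C) - P(A & B, C)
               for A in S for B in S for C in S)
```

Note that Lemma 17 is checked for every C, including the zero element, where each term equals 1 and the identity reduces to 1 = 1 + 1 - 1, matching Case 1 of its proof.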





C. For proof that A1-A5, R2, R3', and R4 yield

RP3. If $P(C,B) \neq 1$ for at least one C in S, then $P(\bar{A},B) = 1 - P(A,B)$.


Suppose that $P(C,B) \neq 1$ for at least one C in S. Then $B \neq C \cap \bar{C}$ for that C by Lemma
12, and hence $P(\bar{A},B) = 1 - P(A,B)$ by R3'. (Proof of Lemma 12 uses only
A1-A5 and Lemma 11, proof of Lemma 11 uses only R2 and R4, and proof of
Lemma 10 uses only A1.)

16 The proofs that follow are largely adaptations of proofs on pp. 93-4 of von Wright's [1957].


D. For proof that A1-A5, R1-R2, RP3, and R4-R5 do not yield

R3'. If $B \neq C \cap \bar{C}$ for at least one C in S, then $P(\bar{A},B) = 1 - P(A,B)$,


and hence that RP3 is weaker than R3. Let S be any four-membered Boolean
algebra $\{a, \bar{a}, a \cap \bar{a}, a \cup \bar{a}\}$ and P be this real-valued binary function on S

                                   B
P(A,B)               $a$   $\bar{a}$   $a \cap \bar{a}$   $a \cup \bar{a}$
    $a$               1      0            1                  0
A   $\bar{a}$         1      1            1                  1
    $a \cap \bar{a}$  1      0            1                  0
    $a \cup \bar{a}$  1      1            1                  1


It is easily verified that P meets R1-R2, RP3, and R4-R5.
Yet $P(\bar{A},a) \neq 1 - P(A,a)$ for every A in S even though $a \neq a \cap \bar{a}$.¹⁷


Temple University 


REFERENCES 


BERNSTEIN, B. A. [1934]: ‘A Set of Four Postulates for Boolean Algebras in Terms of the 
"Implicative" Operation', Transactions of the American Mathematical Society, 36, pp. 
876-84. 

BYRNE, L. [1946]: 'Two Brief Formulations of Boolean Algebra', Bulletin of the American
Mathematical Society, 52, pp. 269-72.

CARNAP, R. [1950]: Logical Foundations of Probability. Chicago University Press.

HARPER, W. L., LEBLANC, H. and VAN FRAASSEN, B. C. [1983]: 'On Characterizing Popper
and Carnap Probability Functions', in Essays in Epistemology and Semantics, Haven
Publications, pp. 140-52.

HUNTINGTON, E. V. [1933]: ‘New Sets of Postulates for the Algebra of Logic’, Transactions 
of the American Mathematical Society, 35, pp. 274-304. 


17 Thanks are due to the referee and to David Miller for their criticism of an earlier draft and
suggestions, to Peter Roeper who thought through with me many of the points made here, and
to John Serembus who helped proofread the paper.


KOLMOGOROV, A. N. [1933]: 'Grundbegriffe der Wahrscheinlichkeitsrechnung', Ergebnisse
der Mathematik, 2, pp. 1-61.

LEBLANC, H. [1981]: 'What Price Substitutivity: A Note on Probability Theory',
Philosophy of Science, 48, pp. 317-22.

LEBLANC, H. [1982]: 'Popper's 1955 Axiomatization of Absolute Probability', Pacific
Philosophical Quarterly, 1982, pp. 133-45.

LEBLANC, H. [1989]: 'Popper's Formal Contributions to Probability Theory', in
Perspectives on Psychologism, Martinus Nijhoff, pp. forthcoming.

LEBLANC, H. and ROEPER, P. [1989]: 'On Relativizing Kolmogorov's Absolute Probability
Functions', Notre Dame Journal of Formal Logic, pp. forthcoming.

POPPER, K. R. [1938]: 'A Note on Probability', Mind, pp. 275-7.

POPPER, K. R. [1955]: 'Two Autonomous Axiom Systems for the Calculus of
Probabilities', British Journal for the Philosophy of Science, 6, pp. 51-7.

POPPER, K. R. [1957]: 'Philosophy of Science: A Personal Report', in British Philosophy
in the Mid-Century, George Allen and Unwin, pp. 153-91.

POPPER, K. R. [1959]: The Logic of Scientific Discovery, Basic Books, Inc.

POPPER, K. R. [1962]: Conjectures and Refutations, Basic Books, Inc.

POPPER, K. R. [1963]: 'Creative and Non-Creative Definitions in the Calculus of
Probability', Synthese, 15, pp. 167-86.

RÉNYI, A. [1955]: 'On a New Axiomatic Theory of Probability', Acta Mathematica Acad.
Scient. Hungaricae, 6, pp. 286-335.

RÉNYI, A. [1964]: 'Sur les Espaces des Probabilités Conditionnelles', Annales de l'Institut
Poincaré, N Série, Section B1, pp. 3-21.

ROEPER, P. and LEBLANC, H. [1989]: 'Indiscernibility and Identity in Probability
Theory', forthcoming.

ROSENBLOOM, P. C. [1950]: The Elements of Mathematical Logic, Dover Publications, Inc.

SUPPES, P. [1974]: 'Popper's Analysis of Probability in Quantum Mechanics', in The
Philosophy of Karl Popper, Northwestern University, pp. 760-74.

VON WRIGHT, G. H. [1957]: The Logical Problem of Induction, second revised edition,
Macmillan and Co., Ltd.


Brit. J. Phil. Sci. 40 (1989), 183-184 Printed in Great Britain 


A Refutation of Popperian 
Inductive Scepticism 


KEN GEMES 


According to Popper¹

(A) $P(Fa / Fb) = P(Fa)$.


For Popper this is the heart of his inductive scepticism. From (A) we may easily
derive


(B) $P(\sim Fa / \sim Fb) = P(\sim Fa)$.

Presumably Popper would also accept the following claims

(C) $P(Fa / a \neq b \,\&\, Fb) = P(Fa)$

(D) $P(\sim Fa / a \neq b \,\&\, \sim Fb) = P(\sim Fa)$.


To claim, for instance, that $a \neq b \,\&\, Fb$ is favorably relevant to $Fa$ is to embrace,
at least, a limited form of inductivism. On the other hand, to claim that
$a \neq b \,\&\, Fb$ is unfavorably relevant to $Fa$ is to embrace, at least, a limited form of
counter-induction. Presumably Popper is as eager to avoid counter-induction
as he is eager to avoid induction. Therefore we may safely presume Popper
would accept (C) and (D).

Unfortunately for Popper the conjunction of (A), (B), (C) and (D) is 
inconsistent. 


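Proof²
1. $P(Fa / Fb) = P(Fa)$ [Ass.]
2. $P(Fa / a \neq b \,\&\, Fb) = P(Fa)$ [Ass.]
3. $P(a \neq b \,\&\, Fa \lor a = b \,\&\, Fa / Fb) = P(Fa)$ [1, Eq. Pr.]
4. $P(a \neq b \,\&\, Fa / Fb) + P(a = b \,\&\, Fa / Fb) - 0 = P(Fa)$ [3, Add. Pr.]
5. $P(a \neq b / Fb) \times P(Fa / a \neq b \,\&\, Fb) + P(a = b / Fb) \times P(Fa / Fb \,\&\, a = b) = P(Fa)$ [4, Conj. Pr.]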

1 Cf. K. Popper, The Logic of Scientific Discovery, sixth impression (revised [1972]), Hutchinson &
Co., London, p. 367. For convenience I have altered Popper's terminology. For instance, I use
'Fa' where Popper uses 'aj'. Nothing substantive rests on this exchange.

2 In this argument the following abbreviations are used: 'Ass.' for 'Assumption', 'Eq. Pr.' for
'Equivalence Principle', 'Add. Pr.' for 'Addition Principle', 'Conj. Pr.' for 'Conjunction Principle',
'Neg. Pr.' for 'Negation Principle' and 'Ar.' for 'Arithmetic'.


6. $P(a \neq b / Fb) \times P(Fa) + P(a = b / Fb) = P(Fa)$ [2, 5]
7. $P(a \neq b / Fb) + P(a = b / Fb) / P(Fa) = 1$ [6, ÷ by P(Fa)]
8. $P(a \neq b / Fb) + P(a = b / Fb) = 1$ [Neg. Pr.]
9. $P(a = b / Fb) = P(a = b / Fb) / P(Fa)$ [7, 8, Ar.]
10. $1 = 1 / P(Fa)$ [9, ÷ by P(a=b/Fb)]
11. $P(Fa) = 1$ [10, × by P(Fa)]


By the same reasoning, uniformly replacing 'Fa' with '~Fa' and 'Fb' with
'~Fb' we obtain the result $P(\sim Fa) = 1$. Yet from 11. above and the Negation
Principle we obtain $P(\sim Fa) = 0$.
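The arithmetic of steps 6 to 11 can be retraced numerically. The following is a minimal sketch under stated assumptions: the names `x` (standing for P(a≠b/Fb)) and `p` (standing for P(Fa)), the helper functions, and the grid of candidate values are all introduced here for illustration and are not in the original argument. By step 8, P(a=b/Fb) is 1 - x, so step 6 reads x·p + (1 - x) = p.

```python
from fractions import Fraction


def step6_residual(x, p):
    """Left side minus right side of step 6: x*p + (1 - x) - p.

    x stands for P(a != b / Fb), so 1 - x is P(a = b / Fb) by step 8;
    p stands for P(Fa).
    """
    return x * p + (1 - x) - p


def solutions_for(x, denominators=10):
    """Values p = k/denominators in (0, 1] at which step 6 holds exactly."""
    grid = [Fraction(k, denominators) for k in range(1, denominators + 1)]
    return [p for p in grid if step6_residual(x, p) == 0]
```

For any x below 1 the only grid value satisfying step 6 is p = 1, in line with step 11. When x = 1 every value of p satisfies it, which reflects the fact that step 10 divides by P(a=b/Fb) and so presupposes that this quantity is not zero.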


If Popper is correct in his assumption that inductive scepticism entails (A), and
hence (B), and assuming inductive scepticism equally entails (C) and (D), we
may regard the above argument as a refutation of inductive scepticism.³


3 It would perhaps be more accurate to speak here of the combination of inductive scepticism and
counter-inductive scepticism entailing (A), (B), (C) and (D). Then, in as much as we accept the
above argument, we would be left with the choice of embracing some form of induction or some
form of counter-induction.


Department of Philosophy 
University of Pittsburgh 


Brit. J. Phil. Sci. 40 (1989), 185-190 Printed in Great Britain


REVIEW ARTICLE 


The New Experimentalism* 


Allan Franklin has written an important book that should be required reading
for philosophers of science. Franklin, an experimental high-energy physicist
who is now concentrating on the history and philosophy of science, offers
precisely what might be expected from an experimental physicist: a layout of
data concerning experimental sequences in physics. Four important experimental
sequences are considered in detail: the discovery of parity nonconservation,
the discovery of the failure of CP invariance, Millikan's discovery of the
unit charge of the electron, and the nondiscovery of parity nonconservation.
The last of these, marked by the failure to observe something important, is
especially interesting, since it avoids projecting experimental sequences worth
recovering as only those that culminate in success. The data gathered from the
experimental sequences is deployed to question existing pronouncements by
philosophers of science on experiment, but Franklin does not develop an
articulated general philosophy of science that incorporates his new
experimentalism. This book is a provocation for any current philosopher of science
who would develop an account of science that could take the details of actual
experimental practice into account.

If philosophers will have to draw some of their own conclusions from 
Franklin's study, the significance of his focus on experimentation is worth 
some discussion. The philosophy of science that developed out of 20th century 
positivism placed a heavy foundational emphasis on observational fact as the 
means of controlling theoretical growth, but although theoretical statements 
were logically articulated against observational statements in increasingly 
sophisticated ways as positivism developed, positivism paid little attention to 
the way in which statements of observational fact were produced in 
experimental practice. One simply began to philosophize on the assumption 
that science was capable of delivering a data base of settled observational 
statements. In the vengeful dismantling of positivism undertaken after Kuhn's 
work, the old connotations of fact were replaced by assertions that theoretical 
expectations somehow determined the observations of science, and later by 
suggestions that observations are constructed by groups of scientists engaged 
in a social process of negotiation. Neither of these positions is really compatible 
with the intuition that experimental observation and theoretical conjecture 


* Review of Allan Franklin [1986]: The Neglect of Experiment. Cambridge: Cambridge University
Press. xii + 290 pp. ISBN 0-521-32016-X.


186 Robert Ackermann 


should, somehow, have a symmetrical status with respect to scientific
development, either being capable of producing a temporary fixed point onto
which progress can be hinged. In order to recover this fundamental intuition
based on scientific practice, a number of recent studies have gone back to the
history of science to study experimentation as a means of grounding the nature
and origins of the observational facts that many still think must offer
important objective constraints for scientific theory, at least at certain pivotal
points in scientific development. Empiricism is in this way resurrected
(although transformed) by finding a philosophical account of experimental
data that are especially worth having, and that function as solid hinge points
for controlling theoretical conjectures. Experiment must be important to
science, or it would die out as an outmoded fashion. Some philosophical
account of its ongoing importance seems required.

A philosophical context for Franklin's work is provided by Hacking [1983]. 
Hacking suggests that the old model of the structure of science, in which layers 
of theory and observation are brought into a logical linkage by such notions as 
explanation and confirmation, needs to be replaced by a set of activities in 
Science including speculation, calculation, and experimentation. These activi- 
ties are to be related by the production of (usually simplified) models in a 
context that allows clear points of contact, the models being developed with an 
eye to easy computation and accessible experimental verifications. Hacking's 
survey of the complexity of actual scientific practice makes a powerful case for 
the replacement that he proposes. An important point of comparison between 
the old notion of observation and the newer concentration on experimentation 
is that an experiment is a complex activity undertaken over time (involving the 
design and manufacture of equipment, the calibration of equipment, checks on 
the proper functioning of the equipment, etc.) that may issue in observations 
that can be reported as data. What's needed in this context is a discussion of
whether specific experimental practices can in some sense legitimate or 
validate observational reports, and how the strength of such legitimation 
might be taken into account in a philosophy of science. Hacking's suggestion 
points to a legitimate area of exploration, but at this point no settled directions 
of development for the new experimentalism have come into view. 

If the notion of experimentation is to be central to a new philosophy of 
science, there are at least two possibilities to consider: that the notion of 
experimental legitimation could be exploited towards the end of providing a 
new experimental foundationalism, and that the notion of experimental 
legitimation could be exploited towards the end of providing a dialectical 
account of scientific progress in which either theory or experiment could 
provide the temporary support and constraint for the tentative advance of the 
other, given the concrete set of scientific practices available at a specific point 
in time. Franklin tends towards the first of these options, asking as his central 
philosophical questions what role experiment plays in theory selection or
confirmation, and how experiments can be organized so as to result in the
rational separation of experimental fact from experimental artifact. But no 
matter how persuasively rational a retrospective account of physics experi- 
ments may be, such an account cannot settle the question of whether a 
methodology involving experimentation can be made applicable to contem- 
porary growth points in physical theory. Such a methodology would require a 
framing philosophy that Franklin does not provide in philosophically satisfy- 
ing detail. At times, Franklin wavers subtly between suggesting that there is an 
epistemology that can distinguish experimental fact from artifact, and 
suggesting the somewhat more sceptical conclusion that observational facts 
can be accepted as valid when all of the plausible sources of error in the relevant
experimental sequences have been eliminated. The gap between these two 
positions could only be closed by a philosophical account of plausibility. In 
tending to think of fraud, or outright mistake, rather than of error, as the major 
foil to validated experimental work, Franklin slips past some of the difficulties 
with the notion of plausibility that seems so crucial to his methodological 
remarks. That Franklin hasn't yet closed the gap is evident in his discussion of 
Millikan's oil drop experiment, where his account of Millikan's apparently 
prescient and simultaneously seemingly arbitrary exclusion of specific data 
raises once again the rough details of practice that always seem to cut against
the idea that a generally satisfactory epistemology can be teased out of the 
experimental narratives. It is the rough data, contrasted with the philosophical 
temptations, that provides the philosophical excitement in Franklin's discus- 
sion. 

It would seem reasonable to ask whether the narratives offered, no matter 
how stimulating in their detail, can be regarded as the indubitable core of the 
experimental sequences in question. The reaction to Hanson [1963] suggested 
that different laboratories involved in the same major experimental sequences 
can be inserted at the nodes of a kind of parody of popular accounts of relativity 
theory in which each laboratory sees itself as the centre of progress, its own 
activities causing the reactions and developments in the other laboratories. 
Probably there would be specialist quibbles with these narratives, and
dubieties expressed by some of the participants, but these narratives suggest
that a crucial level of scientific historiography is now being attained by
scientists who become reflective historians of their craft. This kind of internal
history is likely to be decisive in the next stages of piecing together more
adequate philosophies of science. Franklin’s narratives are basically exposi- 
tory, but the level of exposition depends on some prior knowledge of the 
physics involved. Many philosophers will have to put Franklin’s book down 
and consult other material if they are to get beyond the gist of the reasons for 
the experimental sequences. Between the literature suggested in the footnotes 
and the bibliography, however, it is possible to come onto a quite detailed 
understanding of these experimental sequences and their complexities from a
rather modest basis. Working at Franklin's text is likely to put some useful flesh
onto the bare bones of a philosopher's conception of the reporting of
observational data. This is history at a level that should be the input into the
philosophy of science as its experimental check.

An important subtext of Franklin's histories concerns the question of why 
experiment has been rigorously neglected in history and philosophy of science 
by comparison to theory. Almost all physicists were experimentalists (if also
theorists) until the 20th century, giving physics a grounding in an intimate 
knowledge of experimentation that would be difficult to locate in most 
philosophical accounts of theorizing in physics. The emerging age of 
autonomous theorists was caught up into emerging positivism in a way that 
seems in retrospect to have caused experimentalists to recede automatically 
into the background, partly because positivism looked at the logical structure 
of written scientific books and papers, and writing is both essential to history 
and yet biased towards the representation of theory. An irony of Franklin's 
book is that an actual experimental set-up is only portrayed once in its fully 
contingent form, and that in the glorious confusion of apparatus in the dust
jacket photograph. Inside, as in all 'histories' of experimentation, experimental
set-ups are given in schematic diagrams that portray the theory of how
apparatus could work so as to produce meaningful data, and observational 
data are represented in the smoothed form gathered from properly working 
apparatus (with the notable exception of reproductions of some of Millikan's 
data sheets, and some reproductions of electron micrographs). In fact, the 
details of experimental sequences represent an almost irrational opportunism 
to the orderly philosophical mind. An experimentalist may wish to measure a 
certain phenomenon, but have available an apparatus that can only measure 
another phenomenon, so that progress involves turning the first phenomenon 
into the second so that it can be measured. It may be that one piece of 
apparatus designed to produce the phenomenon to be measured can only be 
placed where one would like to place the only apparatus that can apparently 
measure the phenomenon. In a dizzying variety of such variations, experimen- 
talists must permute and adjust what is available in order to simulate what is 
desired. The validity of experimental results for scientists often depends on an
intimate scientific grasp of what's available in the way of equipment,
threatening any logical narrative that doesn't fill in this surround with failing 
to produce the data required for an understanding of scientific judgments of 
validity. The constraints on experimental sequences are but one of the things 
that must be more reflectively explored before the inherent problems in writing 
experimental histories can be resolved in a manner that would allow a fuller 
probing of the possibilities latent in the new experimentalism for the 
philosophy of science. 

Galison [1987] provides an associated and complementary look at experimental
sequences in physics. Galison asks explicitly why experiments end, that
is, why experimenters stop performing a given experiment, and move on to
new ones. A refinement in Galison’s account relevant to evaluating Franklin is 
that Galison finds the appropriate ending of experiments in his experimental 
sequences to depend on the kind of measuring and data analysis devices that 
are involved in the sequences. Also dealing with some modern high-energy 
experiments, Galison exploits a distinction between an image-producing 
apparatus and a counting apparatus. An experiment involving an image- 
producing apparatus often ends appropriately with a ‘golden event,’ that is, a 
picture or image of something whose existence has been conjectured, but 
possibly questioned. An experiment involving a counting apparatus often ends 
appropriately when a decision based on some probability model suggests that 
enough counts have been taken for some purpose. A counting sequence will 
typically not have quite as decisive a final (ending) event. Galison is not as 
concerned with the question of legitimation as Franklin seems to be, noting 
that experiments never have a strictly logical terminus, so that the decision to 
end experiment always involves risk. Not surprisingly, Galison is more relaxed
about Millikan’s missing drops, observing that Millikan’s pragmatism and 
experienced eye caused him to make decisions that we can recognize in 
retrospect as justified. Galison thus moves more towards the philosophical 
equipoise of theory and experiment suggested above, and a joint reading of 
Galison and Franklin is highly recommended as a way of becoming aware of 
the space opened up for consideration by what seem to be two major variants 
in the emerging new experimentalism. 

It has already been noted that Franklin attacks some extant philosophical 
opinion on the basis of his narratives. At times, some very sharp points against 
philosophical opinion are scored. For example, philosophers who suppose that 
increasing refinement of experimental technique and instrumentation will 
cause a gradual (or asymptotic) approximation of the true data values should be
shocked by Franklin’s decisive counterexamples in which sudden large leaps to 
new values can be observed in experimental sequences that seemed to have 
been converging to ‘correct’ values. At times, there is an apparent fixation on 
the past of the philosophy of science. Franklin worries too much about the 
Quine-Duhem problem, which he would like to outflank by an elaborate 
Bayesian solution involving an experimentalist setting of the prior probabili- 
ties of auxiliary hypotheses. The Quine-Duhem problem is generated within 
the old philosophy of science by its reliance on the logical articulation of theory 
and observation. A more thorough reworking of philosophical concerns
within the new philosophy of replacements suggested by Hacking would make
the Quine-Duhem problem, as it is traditionally formulated, simply irrelevant.
The models used to connect theory and data, or more properly theoretical and
experimental activity, are not in general logical consequences of accepted
statements of theory or data, but simplifications of complexity chosen by an
adroit manipulation of simplifications against the background of the accepted
mathematical practices.





Brit. J. Phil. Sci. 40 (1989), 191-211 Printed in Great Britain 


REVIEW ARTICLE 
The Philosophy of Quantum Mechanics 


The trouble with the philosophy of quantum mechanics, I'm told, is that as
soon as you've found a position, you lose your momentum.¹ Three very
different books by philosophers on quantum mechanics have recently
appeared in print. Peter Gibbins's Particles and Paradoxes is a splendid
introduction to current problems in the field, suitable as an introductory text.
Subtitled 'The Limits of Quantum Logic', it concludes with a lengthy essay on
what quantum logic can and can't do. Michael Redhead's book, Incompleteness,
Nonlocality and Realism: A Prolegomenon to the Philosophy of Quantum
Mechanics, covers much the same ground at an advanced level. Here, too, the
aim is to acquaint the reader with the important new work that has emerged 
over the past twenty years or so, following the seminal results of Bell and of 
Kochen and Specker in the sixties. The book reflects the considerable 
contribution by Redhead himself to the debate on the foundations of quantum 
mechanics, and I would regard it as the definitive study of what we have learnt
about interpretative positions that no longer appear viable. The Metaphysics of 
Quantum Theory by Henry Krips has more ambitious aspirations, and claims to 
provide a new realist interpretation of quantum mechanics that resolves the 
measurement problem and other puzzles and paradoxes of the theory. I must 
confess to a certain feeling of awe for anyone who can juggle seventy one 
‘principles’ (usefully collected for the reader at the end of the book—count 
them!) but, mindful of the uncertainty principle that applies here, I am rather 
less sanguine about the merits of the interpretation. 

It is not hard to see why quantum mechanics is conceptually puzzling. The 
theory incorporates an algorithm (formulated in terms of the geometry of 
Hilbert space) for assigning probabilities to events, or propositions, or ranges of 
values of physical magnitudes (the dynamical variables or ‘observables’ of the 
theory). This algorithm resists the representation of the probabilities generated 
by the statistical states (quantum states) as measures over what might be 
termed ‘property states’: assignments of values to the physical magnitudes, or 
‘lists’ of properties, or assignments of truth values to the propositions of the 
system. 

Formally, such a property state might be represented by an ultrafilter in an
algebra, or the atom generating the ultrafilter. For example, the algebra of
physical magnitudes of a classical mechanical system is a commutative
algebra of real valued functions on the position-momentum phase space of the
system. The subalgebra of idempotent magnitudes is a Boolean algebra,
isomorphic to the Boolean algebra of Borel subsets of phase space. The property
state of the system is represented by a phase point (which corresponds to an
atom in the Boolean algebra), or collection of sets to which the point belongs
(this is the ultrafilter of propositions generated by the point), or a listing of all
the properties characterizing the system at a particular time. The statistical
state is represented by a probability measure (in the standard Kolmogorov
sense) over these atoms, or ultrafilters, or phase points, or lists of properties.

¹ I cannot, regrettably, claim the discovery of this important uncertainty principle for the
philosophy of quantum mechanics. Historical research strongly indicates that the principle
was first formulated by a graduate student at the University of Western Ontario, circa 1985.

192 Jeffrey Bub

A fundamental problem of interpretation arises for quantum mechanics 
because the theory apparently provides us with a set of states which are 
statistical states (represented by vectors or statistical operators in Hilbert 
space), without specifying any property states. Moreover, there are good 
grounds (Kochen and Specker, Bell) for supposing that there are no property 
states for quantum systems. The question is then: What do these statistical 
states mean if there are no property states, no lists of properties characterizing 
quantum systems at a particular time? 

The point is that you must have determinateness somewhere in the 
theoretical scheme for the probabilities to make sense. What we want is to 
relate the probabilities to a generalized ‘counting’ (in the measure-theoretic 
sense) over determinate possibilities, whether we interpret the probabilities 
epistemically, or as propensities, or what have you. Measurement results are 
determinate, so it is not surprising that the orthodox interpretation of 
quantum mechanics takes the probabilities as referring to the results of 
measurement. Philosophers have rightly resisted this move. The rival heresy 
postulates hidden variables that serve to parametrize determinate possibilities.
There are lots of ways in which hidden variables might be introduced into 
quantum mechanics, but two variants of this proposal are relevant here. On 
the first variant, the hidden variables simply parametrize property states and 
the probabilities come out as measures over these property states. So a 
particular quantum system, characterized by a set of values for the hidden 
variables at a certain time, is also characterized by a ‘list’ of properties (an 
assignment of values to the physical magnitudes). On the second variant, the 
hidden variables either determine the results of measurements of the physical 
magnitudes (deterministic case), or fix the probabilities of measurement results 
(stochastic case). Here a particular quantum system, characterized by a set of 
values for the hidden variables at a particular time, is not characterized by a 
‘list’ of properties or assignment of values to the physical magnitudes. Rather, 
these quantum properties are said to be ‘indeterminate’ and only become 
determinate on measurement, whatever that means. 

Both variants are subject to restrictions which follow from the Kochen and
Specker theorem (Kochen and Specker, [1967]).² The theorem says that in the
case of a quantum mechanical system with a Hilbert space of three or more 
dimensions, there are finite sets of physical magnitudes for which no value 
assignments exist, if the assignments are required to preserve the functional 
relationships between the magnitudes which hold in virtue of their represen-
tation as linear operators in the space. What this means for a system with a
Hilbert space of three dimensions is this: Each maximal physical magnitude 
(each magnitude with three distinct possible values) is associated with an 
orthogonal triple of vectors corresponding to the different possible values of the 
magnitude. Each non-maximal magnitude (with only two distinct possible 
values) is associated with a vector and its orthogonal plane. For a suitably
chosen set of magnitudes, it is impossible to assign one and only one value to
each magnitude (which amounts to a selection of one and only one vector in 
each orthogonal triple, or a choice between the vector and its orthogonal plane 
in the case of a non-maximal magnitude) in such a way that a certain 
‘meshing condition’ is satisfied. This is the condition that if two orthogonal 
triples of vectors associated with two maximal magnitudes coincide in a 
common vector, or if the vector in a vector-plane pair associated with a non- 
maximal magnitude coincides with one of the vectors in an orthogonal triple 
corresponding to a maximal magnitude, then any selection function that 
selects this common vector for one of the magnitudes must select the same 
vector for the other magnitude. So property states are definable only if they 
violate the meshing condition. 

One way to violate the meshing condition here is to replace each quantum 
property (each non-maximal magnitude with the two possible values 1 and 0) 
with a ‘fan of properties’, one for each choice of orthogonal vectors spanning 
the plane. Then a property state is definable on this expanded set of quantum 
properties. Thus, the first variant is possible only if we ‘de-Ockhamize’ the 
theory in this way. The consequence of the Kochen and Specker theorem for 
the second variant is contextualization: the set of properties that become 
determinate when a measurement is performed depends on the measurement 
context. Specifically, whether or not a property obtains after measurement 
depends not only on the hidden variables but also on specific features of the 
measurement process, insofar as the measurement yields a value for a 
particular 3-valued magnitude specifying a particular choice of orthogonal 
vectors spanning the Hilbert space plane associated with the property. 

It follows from Bell's result (Bell, [1964])³ that the contextualization (or, for
that matter, the de-Ockhamization) has to be non-local. That is, if two systems
are subsystems of a single composite system in a statistical state that is a linear
superposition of subsystem statistical states, then whether or not a property
obtains for a system over here (one might add: after measurement) can depend
on what property obtains (after measurement) for a system over there, even
though there is no longer any interaction between the two systems and the
measurement events in question may be outside each other's light cones.

² For Redhead and Krips, this theorem plays a central role in the interpretation of quantum
mechanics. Gibbins mentions it in passing (Gibbins, p. 124), in the context of some remarks on
Gleason's theorem (Gleason, [1957]). He omits to include a reference in the bibliography.

³ This is the locus classicus of an extensive literature on the locality issue by Bell and others. For
further references see any of the three books discussed here, especially the 'Notes and
References' section at the end of Chapter 4 in Redhead.

All the conceptual problems of quantum mechanics—all the strange 
manifestations of quantum systems—have their origin in the way in which the
statistical algorithm of quantum mechanics resists a representation of the 
probabilities as measures over determinate possibilities. 

Take, for example, the measurement problem, which has nothing to do with 
measurement. No experimenter faces this problem in the laboratory. It is a 
purely theoretical problem that arises when one tries to relate certain 
probabilities specified by quantum mechanics for a composite system, S+M 
(putatively representing a quantum system interacting with another quantum 
system, the measuring instrument), to a specific assumption about property 
states for S and M (call this the ‘orthodox’ property state assumption). The 
assumption in question—equivalent to Krips' principle (Bohr)—is that a 
property state for a quantum mechanical system is represented by a partial
‘list’ of properties, viz. all those properties corresponding to subspaces that 
contain the vector representing the statistical state of the system. (All the 
propositions corresponding to subspaces containing the state vector are taken 
as true, all those corresponding to subspaces orthogonal to the state vector are 
taken as false, and the rest are regarded as neither true nor false—it is in this
sense that the list is partial.) 

We may dramatize the problem, following Schrödinger (Schrödinger,
[1935]), by taking S as a spin-1/2 particle, say, and M as a macroscopic
system, such as a cat. The cat, initially alive, is placed in a box with a device
capable of killing the cat if triggered. What triggers the device is a quantum
event, say the passage of the particle through a Stern-Gerlach magnet oriented
in the z-direction, with two possibilities for the result, up (or +1/2) and down
(or −1/2). If the particle goes down, the device is triggered and the cat killed; if
the particle goes up, the cat remains alive. If the initial state of S is a 
superposition of the two spin states, with equal probabilities for spin-up and 
spin-down, we would predict the demise of the cat with probability 1/2 (taking 
account of the appropriate interaction between the particle and the Stern- 
Gerlach magnet plus cat). The difficulty is that, according to the theory, after 
the particle has passed through the magnet, but before we open the box and 
look, the statistical state of the entire system, particle + cat (or, more precisely, 
particle + apparatus + cat) is described by a linear superposition with equal 
coefficients for {spin up, cat alive} and {spin down, cat dead}. By our 
supposition about property states, only those propositions corresponding to
subspaces containing the state vector are true, and only those corresponding
to subspaces orthogonal to the state vector are false. Since neither the 
proposition ‘the spin of the particle is up and the cat is alive’ nor the proposition 
‘the spin of the particle is down and the cat is dead’ correspond to subspaces 
which contain the state vector or are orthogonal to the state vector, neither of 
these propositions is true or false. 
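
The three-way verdict can be checked mechanically. What follows is only a sketch under simplifying assumptions: the state is written in a fixed orthonormal basis, each proposition's subspace is spanned by some of those basis vectors, and the function name `truth_value` and the basis labelling are illustrative choices, not anything taken from the books under review.

```python
import math

def truth_value(state, subspace_indices, tol=1e-12):
    """Apply the orthodox property-state rule to one proposition.

    `state` is a normalized state vector written in a fixed orthonormal basis;
    the proposition's subspace is spanned by the basis vectors whose indices
    are listed in `subspace_indices` (an illustrative restriction).
    """
    # Squared length of the projection of the state onto the subspace.
    weight = sum(abs(state[i]) ** 2 for i in subspace_indices)
    if abs(weight - 1.0) < tol:
        return "true"           # the subspace contains the state vector
    if weight < tol:
        return "false"          # the subspace is orthogonal to the state vector
    return "indeterminate"      # neither: the 'list' of properties is partial

# Hypothetical basis order: (up, alive), (up, dead), (down, alive), (down, dead).
amplitude = 1 / math.sqrt(2)
cat_state = [amplitude, 0.0, 0.0, amplitude]

up_and_alive = truth_value(cat_state, [0])
down_and_dead = truth_value(cat_state, [3])
up_and_dead = truth_value(cat_state, [1])
span_of_both = truth_value(cat_state, [0, 3])

print(up_and_alive, down_and_dead, up_and_dead, span_of_both)
```

On this rule the two correlated propositions come out indeterminate, ‘spin up and cat dead’ comes out false, and only the proposition corresponding to the plane spanned by the two correlated vectors comes out true.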

So, on this proposal, prior to opening the box, the cat must have made a 
transition from a property state of being alive (determined by the initial 
statistical state of the cat) to a state of limbo, neither alive nor dead. This is 
clearly absurd. A further transition, from the indeterminate state of limbo to a 
determinate state of ‘alive,’ or a determinate state of ‘dead’ (exclusively), is
required to account for what we actually find when we open the box and 
look—and this is not explained by the theory. There is no way of avoiding this 
conclusion, essentially because of the linearity of state transitions in the theory 
and the way in which the coupling between interacting systems is represented 
in quantum mechanics. The measurement problem illustrates the difficulty 
inherent in defining a property state for quantum systems as a partial ‘list’ of 
properties, leaving the rest indeterminate—what Schrödinger referred to as a
‘blurred model’⁴ of reality.

The whole matter can be put rather more technically in terms of the 
distinction between 'pure' and 'mixed' (statistical) states in quantum mecha- 
nics. A pure statistical state is represented by a vector or 1-dimensional 
subspace in Hilbert space. A mixed statistical state (or mixture) is represented
by a probability distribution (in the usual Kolmogorov sense) over pure
statistical states, formally by a non-idempotent statistical operator in Hilbert
space. One might suppose that a system specified by a mixed state of, say, 50%
‘spin up in the z-direction’ and 50% ‘spin down in the z-direction’ would be
either in the pure statistical state 'spin up in the z-direction' with probability
1/2, or in the pure statistical state 'spin down in the z-direction' with
probability 1/2 (exclusively), but this is not so. It turns out that the statistics
characterized by this mixed state is identical with the statistics characterized
by mixing other spin states (pure statistical states) in various proportions, e.g.
50% ‘spin up in the θ-direction’ and 50% ‘spin down in the θ-direction’, for any
θ, or certain other proportions of non-orthogonal spin states. So a straight-
forward ‘ignorance interpretation’ of mixtures cannot be maintained without
qualification. In the special case of a spin-1/2 system (represented on a 2-
dimensional Hilbert space), there is actually no logical contradiction involved 
in supposing that all the spin propositions have determinate truth values, so an 
ignorance interpretation coupled with the orthodox property state assumption 
is in fact possible. But for a spin-1 system, or in the general case, for quantum 
systems whose statistical states are represented in three or higher dimensional 
Hilbert spaces, it is not possible to interpret a mixture as a classical probability
distribution over determinate property states (via the probability one assign-
ments of the pure statistical states in the mixture).

⁴ In (Schrödinger, [1935]). Quoted from the translation in (Wheeler and Zurek, [1983]), p. 157.
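
The non-uniqueness of the decomposition of a mixture is easy to verify numerically for spin-1/2. The sketch below assumes spin directions confined to the x-z plane, so that real amplitudes suffice; the helper names (`projector`, `mix`, `equal_mixture`) and the particular angle are illustrative choices only.

```python
import math

def projector(theta):
    """Statistical operator of the pure state 'spin up along theta' (x-z plane)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * c, c * s], [c * s, s * s]]

def mix(weights, states):
    """Weighted sum of 2x2 matrices: the statistical operator of a mixture."""
    return [[sum(w * m[i][j] for w, m in zip(weights, states)) for j in range(2)]
            for i in range(2)]

def equal_mixture(theta):
    """50% 'up along theta' and 50% 'down along theta' (down = up along theta + pi)."""
    return mix([0.5, 0.5], [projector(theta), projector(theta + math.pi)])

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

w_z = equal_mixture(0.0)        # mixture of z-up and z-down
w_theta = equal_mixture(1.234)  # mixture of up and down along an arbitrary direction
# Three non-orthogonal states at 120 degrees to one another, weight 1/3 each.
w_trine = mix([1 / 3] * 3, [projector(2 * math.pi * k / 3) for k in range(3)])

print(close(w_z, w_theta), close(w_z, w_trine))
```

All three recipes yield the same statistical operator (half the identity), so nothing in the statistics singles out one of them as the ‘real’ ensemble of pure states.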

Now suppose S is in a pure statistical state represented by a Hilbert space 
vector that is a linear superposition of 'spin up in the z-direction' and 'spin 
down in the z-direction’, with coefficients $c_1$ and $c_2$ not equal to 1 or 0, and M is
in some suitable 'ready state' (a pure statistical state representing the zero 
‘pointer reading’ of M). It can be shown that there exists an interaction that 
transforms the product state (representing the initial statistical state of the 
composite system S+M) to a new pure statistical state (the state of the
composite system after the measurement interaction) that is a linear 
superposition of product states: 'spin up in the z-direction' for S correlated with 
a pure statistical M state (‘pointer reading’ corresponding to ‘up’), and ‘spin 
down in the z-direction' for S correlated with another pure statistical M state
(‘pointer reading’ corresponding to ‘down’). Moreover, this linear superposi-
tion has the same coefficients $c_1$, $c_2$ as the initial state of S, so the statistical state
of the composite system assigns the same probabilities to the S properties and 
M properties labelling the product states (‘spin up in the z-direction’ and 
‘pointer reading up’; ‘spin down in the z-direction’ and ‘pointer reading down’) 
as the initial S state assigned to the properties spin up and spin down in the z- 
direction, respectively. But this composite system pure state does not licence 
the inference that either S has the property 'spin up in the z-direction' and the 
M 'pointer reading' corresponds to 'up', or (exclusively) S has the property 
‘spin down in the z-direction’ and the M ‘pointer reading’ corresponds to 
'down'. 

In fact, the orthodox property state assumption licences the inference that 
certain specific composite system propositions are true (all those correspond- 
ing to subspaces in the Hilbert space of the composite system that contain the 
state vector), certain others false (those corresponding to subspaces orthogo- 
nal to the state vector), and the rest indeterminate, so the propositions in 
question (which are assigned probabilities $|c_1|^2$, $|c_2|^2$ not equal to 1 or 0) are
neither true nor false for the composite system pure state. What would 
apparently licence the inference that these properties are either true or false is a 
statistical state for S+M that is a mixed statistical state, representing a 
probability distribution with weights $|c_1|^2$, $|c_2|^2$ over the same pure statistical
product states as the linear superposition. I say 'apparently' because this 
inference is still subject to difficulties with the ignorance interpretation of 
mixtures and the orthodox property state assumption. 

The measurement problem is usually put this way, at least since von 
Neumann's formulation in The Mathematical Foundations of Quantum Mecha- 
nics (von Neumann, [1955]): How do we explain the fact that measurement 
interactions in quantum mechanics must apparently transform pure statistical 
states into mixed statistical states, while all other interactions always 
transform pure statistical states to pure statistical states? What we apparently
want as the result of a measurement interaction is a (suitable) mixture W. The
closest thing we can get is a similar looking pure state Ψ (with the right sorts of
correlations between S states and M states, and the right coefficients to yield
the probabilities we expect), because the theory tells us that there do not exist 
interactions which transform initial pure states into mixtures (all time 
evolutions in quantum mechanics are unitary, and no unitary transformation 
will do the trick). In his formulation of quantum mechanics, von Neumann 
introduced a new postulate, the ‘projection postulate’, to cover measurement 
transitions to mixtures or, equivalently, the transition from a pure statistical 
state that is a linear superposition over eigenstates of the physical magnitude 
measured to the eigenstate corresponding to the measurement outcome. The 
measurement problem then becomes the problem of justifying the role of the 
projection postulate in the theory. There is an accepted standard form of the 
projection postulate for measurements of maximal physical magnitudes, but 
various formulations of the postulate have been proposed for measurements of 
non-maximal physical magnitudes and for magnitudes with continuous 
spectra (e.g. von Neumann's rule (von Neumann, [1955]), the Lüders rule
(Lüders, [1951])).

Now consider the composite system pure statistical state Ψ that is the closest
thing we can get to the mixture W that we apparently want. It can be shown 
that S separately and M separately are represented by mixed statistical states, 
classical probability distributions of pure statistical S states ('spin up in the z- 
direction’, ‘spin down in the z-direction’) and pure statistical M states (‘pointer 
reading up’, ‘pointer reading down’), respectively, with the same probabilities 
as the composite system mixed state W. The temptation is to say that each 
system, S and M separately, is determinately in some or other pure statistical 
state, with the mixed state probabilities interpreted epistemically. Coupled 
with the assumption about property states, one might suppose that this feature 
of quantum mechanics (the reduction of a composite system state into 
subsystem mixed states) allows one to infer that the system S in a 
measurement interaction is determinate with respect to the S-properties 
measured and the measuring instrument M is determinate with respect to the 
M-properties (‘pointer readings’) correlated with the S-properties measured 
via the measurement interaction (insofar as it can be shown that the pure 
statistical state of the composite system reduces to mixtures of S-states and M- 
states in the required way). 

Of course, this is not a solution of the measurement problem for a variety of 
reasons. The simple ignorance interpretation of mixtures doesn't work. Even if 
it did work, the Kochen and Specker theorem would rule out using the 
ignorance interpretation, together with the orthodox assumption about 
property states, to interpret the mixture as a classical probability distribution
over determinate property states in general (when the probabilities are all
equal, and in certain other cases as well, these property states would have to
violate the meshing condition). Finally, an important distinction is ignored,
that between mixtures $W_S$ and $W_M$ obtained by reducing the statistics of a pure
statistical state Ψ for S+M to S and M separately, and the same mixtures $W_S$
and $W_M$ obtained by reducing the statistics of an appropriate mixed statistical
state W for S+M to S and M separately. For the composite system mixed state
(the statistical state W we apparently want for the composite system after the 
measurement interaction) defines a statistics for the composite system that is, 
in principle, distinguishable from the statistics defined by the composite system 
pure state (the pure statistical state that is the closest thing we can get in the 
theory to the mixed state). 
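
The distinction can be made concrete with a small calculation. This is a sketch under stated assumptions: real coefficients 0.6 and 0.8 are chosen for illustration, the basis index is 2*s + m (s for the spin, m for the pointer), and the composite magnitude used to tell the two states apart is the one that flips spin and pointer together; none of these choices comes from the text.

```python
def outer(v):
    """|v><v| for a real vector v."""
    n = len(v)
    return [[v[i] * v[j] for j in range(n)] for i in range(n)]

def weighted_sum(wa, a, wb, b):
    """wa*a + wb*b for square matrices a, b of equal size."""
    n = len(a)
    return [[wa * a[i][j] + wb * b[i][j] for j in range(n)] for i in range(n)]

def reduce_to_S(rho):
    """Partial trace over M (basis index 2*s + m)."""
    return [[sum(rho[2 * s + m][2 * t + m] for m in range(2)) for t in range(2)]
            for s in range(2)]

def reduce_to_M(rho):
    """Partial trace over S (basis index 2*s + m)."""
    return [[sum(rho[2 * s + m][2 * s + n] for s in range(2)) for n in range(2)]
            for m in range(2)]

def expectation(rho, observable):
    """Tr(rho * observable)."""
    n = len(rho)
    return sum(rho[i][j] * observable[j][i] for i in range(n) for j in range(n))

c1, c2 = 0.6, 0.8  # illustrative coefficients with c1**2 + c2**2 = 1

# Pure state: c1 |up, pointer up> + c2 |down, pointer down>.
rho_pure = outer([c1, 0.0, 0.0, c2])
# Mixture W over the same two product states with weights c1**2, c2**2.
rho_mixed = weighted_sum(c1 ** 2, outer([1.0, 0.0, 0.0, 0.0]),
                         c2 ** 2, outer([0.0, 0.0, 0.0, 1.0]))

# Composite magnitude that flips spin and pointer together (sigma_x on each factor).
flip_both = [[1.0 if i + j == 3 else 0.0 for j in range(4)] for i in range(4)]

pure_value = expectation(rho_pure, flip_both)
mixed_value = expectation(rho_mixed, flip_both)
print(reduce_to_S(rho_pure), reduce_to_S(rho_mixed), pure_value, mixed_value)
```

Both composite states reduce to the same mixtures for S and for M separately, yet the expectation value of the flip magnitude is 2·c1·c2 for the pure state and 0 for the mixture, which is the in-principle distinguishability appealed to above.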

In recent years, philosophers have been more concerned with problems of 
locality and completeness than with the measurement problem. In the 1960's 
Bell extended the Einstein-Podolsky-Rosen argument that quantum mecha- 
nics is either nonlocal or incomplete to show, rather surprisingly, that the 
incompleteness horn of the dilemma does not avoid nonlocality either. 

Consider again the composite system S+M, only now suppose that both S
and M are spin-1/2 systems moving in opposite directions, left and right, and
call the systems $S_L$ and $S_R$. Measuring instruments $M_L$ and $M_R$ capable of
measuring spin in different directions are placed in the paths of the two systems
at a suitable distance apart. Represent the spin magnitudes by A, A′, ... for $M_L$
and B, B′, ... for $M_R$, and the values of these magnitudes by $a_+$, $a_-$; $a'_+$,
$a'_-$; ... and $b_+$, $b_-$; $b'_+$, $b'_-$; ..., respectively.

Bell’s argument is this: Suppose that quantum mechanics is incomplete in
the (weak) sense that there are hidden variables λ which fix the probabilities of
measurement results (presumably more precisely than quantum mechanics
itself), so that, for example,

$$p^{AB}(a \& b) = \int p_\lambda^{AB}(a \& b)\, d\rho(\lambda)$$

where ρ(λ) is the probability distribution of hidden variables yielding the
quantum statistics, and a, b are variables ranging over A-values and B-
values. The superscripts AB serve to indicate that $M_L$ and $M_R$ are set to measure
A and B, respectively. This notation is redundant for the joint probability, but
not for the marginal probabilities $p_\lambda^{AB}(a) = p_\lambda^{AB}(a \& b_+) + p_\lambda^{AB}(a \& b_-)$,
$p_\lambda^{AB}(b) = p_\lambda^{AB}(a_+ \& b) + p_\lambda^{AB}(a_- \& b)$, i.e. $p_\lambda^{AB}(a)$ might not be equal to $p_\lambda^{AB'}(a)$,
and so on, and it may even be the case that none of these quantities are equal to
$p_\lambda^{A}(a)$, the probability of the outcome a for a measurement of A by the
instrument $M_L$ when no instrument $M_R$ is placed in the path of $S_R$ (or when $M_R$
is switched off).

Suppose there are statistical correlations between $M_L$ measurements and $M_R$
measurements which manifest themselves when we consider many such pairs
of systems, all prepared in the same composite system pure state, i.e. suppose
that $p^{AB}(a \& b) \neq p^{AB}(a)\,p^{AB}(b)$, and similarly for combinations of other pairs of
physical magnitudes (AB′, A′B′, etc.). If we assume (‘outcome independence’)
that

$$p_\lambda^{AB}(a \mid b) = p_\lambda^{AB}(a)$$

for all values a of A and b of B, and for all pairs of physical magnitudes (and
similarly for conditionalization on $M_L$ measurement outcomes), and also
(‘parameter independence’) that

$$p_\lambda^{AB}(a) = p_\lambda^{A}(a)$$

for all values a of A and all spin measurements B, B′, ..., of $M_R$ (and similarly
for changes of the spin magnitude measured by $M_L$), then since

$$p_\lambda^{AB}(a \& b) = p_\lambda^{AB}(a \mid b)\,p_\lambda^{AB}(b)$$

it follows that

$$p_\lambda^{AB}(a \& b) = p_\lambda^{A}(a)\,p_\lambda^{B}(b)$$

i.e. the superscripts become redundant and we may write

$$p_\lambda(a \& b) = p_\lambda(a)\,p_\lambda(b)$$

The latter conditional statistical independence requirement has been termed
(strong) locality (or ‘factorizability’) in the literature. Its unpacking as two
distinct conditions, outcome independence and parameter independence, is
due to Jarrett (Jarrett, [1984]), who terms outcome independence ‘complete-
ness’ and parameter independence ‘locality’. Outcome independence says that
λ screens off $M_L$ measurement outcomes from $M_R$ measurement outcomes.
Parameter independence says that λ screens off $M_L$ outcomes from what
physical magnitude (spin magnitude) is measured by $M_R$. It is fairly easy to
show that if we assume both conditions an inequality is derivable which is
violated by the quantum statistics. Consider the expression:


$$K(\lambda) = p_\lambda(a)\,[\,p_\lambda(a')(1 - p_\lambda(b)) + (1 - p_\lambda(a'))(1 - p_\lambda(b'))\,] + (1 - p_\lambda(a))\,[\,p_\lambda(a')\,p_\lambda(b') + (1 - p_\lambda(a'))\,p_\lambda(b)\,]$$

Since K(λ) is a convex combination of two terms, each of which is a convex
combination of terms which take values between 0 and 1, 0 ≤ K(λ) ≤ 1.
Averaging over λ preserves the inequality which, assuming conditional
statistical independence, becomes:


$$0 \le p(a) + p(b) + p(a' \& b') - p(a \& b) - p(a' \& b) - p(a \& b') \le 1$$


This is known as the Clauser-Horne inequality (Clauser and Horne,
[1974]), one of a family of related inequalities first derived by Bell. Since there
exist pairs of systems $S_L$ and $S_R$ with statistics specified by pure states of the
composite system $S_L + S_R$ which violate this inequality for certain spin
directions A, A′, B, B′, it follows that either the hidden variables do not screen off
$M_L$ measurement outcomes from $M_R$ measurement outcomes, or they do not
screen off $M_L$ outcomes from the spin magnitude measured by $M_R$. The first
possibility involves a kind of non-separability between the two systems that
Shimony has termed ‘passion at a distance’: the two systems are not separately
'autonomous' with respect to their properties. The second possibility involves a
more straightforward sense of action at a distance: the conditional probabili-
ties of $M_L$ measurement outcomes can be altered by changing the spin
magnitude measured by $M_R$.
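
Both halves of the argument can be checked numerically. The sketch below makes several assumptions that are not in the text: the pair is taken to be in the singlet state, spin directions are confined to the x-z plane, the outcomes a, a′, b, b′ are all read as ‘up’ along the chosen direction, and the four angles are one convenient choice among many. The function names are illustrative.

```python
import math
import random

def K(pa, pa2, pb, pb2):
    """K(lambda) as displayed in the text; pa2, pb2 stand for p(a'), p(b')."""
    return (pa * (pa2 * (1 - pb) + (1 - pa2) * (1 - pb2))
            + (1 - pa) * (pa2 * pb2 + (1 - pa2) * pb))

def K_expanded(pa, pa2, pb, pb2):
    """The same quantity multiplied out, in the form of the averaged inequality."""
    return pa + pb + pa2 * pb2 - pa * pb - pa2 * pb - pa * pb2

def up(angle):
    """Amplitudes of 'spin up along angle' in the z basis (x-z plane only)."""
    return math.cos(angle / 2), math.sin(angle / 2)

def joint(alpha, beta):
    """Singlet probability that the left result is 'up along alpha' and the
    right result is 'up along beta'."""
    (a0, a1), (b0, b1) = up(alpha), up(beta)
    amplitude = (a0 * b1 - a1 * b0) / math.sqrt(2)
    return amplitude ** 2

def clauser_horne(alpha, alpha2, beta, beta2):
    """p(a) + p(b) + p(a'&b') - p(a&b) - p(a'&b) - p(a&b') for the singlet."""
    p_a = joint(alpha, beta) + joint(alpha, beta + math.pi)  # sum over right results
    p_b = joint(alpha, beta) + joint(alpha + math.pi, beta)  # sum over left results
    return (p_a + p_b + joint(alpha2, beta2)
            - joint(alpha, beta) - joint(alpha2, beta) - joint(alpha, beta2))

# Factorizable case: K stays within [0, 1] for arbitrary probabilities.
rng = random.Random(0)
samples = [[rng.random() for _ in range(4)] for _ in range(1000)]
identity_holds = all(abs(K(*s) - K_expanded(*s)) < 1e-12 for s in samples)
bounded = all(-1e-12 <= K(*s) <= 1 + 1e-12 for s in samples)

# Quantum case: directions a = 0, a' = pi/2, b = pi/4, b' = -pi/4.
ch_value = clauser_horne(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
print(identity_holds, bounded, ch_value)
```

For these directions the singlet statistics give a value of about 1.207, above the upper bound of 1 that any factorizable hidden variable model must respect.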

So much for a brief overview of the core conceptual problems of quantum 
mechanics. All three authors deal with these problems, but while Gibbins and 
Redhead try to get the problems right, Krips is concerned with a specific 
proposal for their solution. 

Gibbins's book is divided into two parts. Part I covers the early history of the
philosophy of quantum mechanics (pre-Kochen-and-Specker, pre-Bell). There 
is a brief account of quantum mechanics for philosophers, followed by an 
illuminating discussion of wave-particle duality, the uncertainty principle, 
complementarity, Einstein versus Bohr, the Einstein-Podolsky-Rosen argu- 
ment, and Popper on realism and 'The Great Quantum Muddle' (the thesis— 
formulated in (Popper, [1967])—that the conceptual problems of quantum 
mechanics are just a ‘muddle’ originating in some fairly elementary miscon- 
ceptions about probability). Part II begins with a very nice exposition of Hilbert 
space quantum mechanics and moves on to the measurement problem, the 
locality issue, and a brief nod at the Kochen and Specker theorem. The book 
concludes with two final chapters on quantum logic. The first develops a ‘user- 
friendly' natural deduction system for quantum logic, similar to Lemmon's 
system for classical logic. The second evaluates the extent to which the 
quantum logical interpretation of quantum mechanics can be taken as 
resolving the conceptual problems of the theory. 

The latter chapter is essentially a critique of Putnam’s ‘The Logic of
Quantum Mechanics' (first published in (Putnam, [1969]) as 'Is Logic 
Empirical?’). The argument here is that quantum logic can only be explana- 
tory in a weak sense: using quantum logic, we avoid the paradoxical 
conclusions associated with nonlocality and interference, but we do not 
thereby show that the probabilities generated by the statistical algorithm of 
quantum mechanics ought to come out the way they do. Actually, Gibbins 
shows that Putnam's method of blocking the derivation of the wrong statistical 
pattern in the 2-slit experiment by appealing to non-distributivity won't work. 
The right approach, he says, is to see that there is no ‘sensible’ conditional 
probability in the quantum probability calculus, and that quantum logic 
therefore prevents the derivation of a puzzle about probabilities in the 2-slit 
experiment. Similarly, Bell's inequality is not derivable quantum logically. 
‘This is not to say that quantum logic removes the paradoxes of nonlocality. 
Rather, quantum logic merely has them built-in, so they are not quantum 
logical paradoxes.’ (Gibbins, p. 162.) Gibbins goes on to show that the 
quantum logician can have an ignorance interpretation of mixtures without 
the usual problems, and produce a derivation of the Lüders version of the
projection postulate, but this does not solve the measurement problem. ‘What
does this tell us? Not that the Lüders rule is explained, or that it is not ad hoc. It
tells us that the Lüders rule is represented in quantum logic by a particularly
attractive material conditional which is well-behaved. But the Lüders rule is no
more ad hoc than our material conditional.... Of course, quantum logic 
cannot explain why there should be a projection, only that if there is one, and if 
it is minimally disturbing, then it is well-described by a well-behaved quantum 
logical conditional.’ (Gibbins, pp. 164-5.) 

Now this is all very well as a critique of Putnam’s early views on non- 
Boolean quantum logic, but it leaves entirely out of consideration important 
later work on the subject by Putnam that substantially alters the picture 
painted here (Friedman and Putnam, [1978]; Putnam, [1981]), to say nothing 
of the application of Kochen and Specker's quantum logic in the further 
development of the thesis that quantum mechanics is to be understood as a 
‘principle theory’ (in Einstein's sense)⁵ of logical structure, by Friedman,
Demopoulos, Stairs, and (unblushingly) myself.⁶ Just as Einstein regarded
relativity theory as a new principle theory of geometric structure (relative to
Newtonian theory), i.e. as a theory that imposes broad, abstract, structural
constraints on events (here the structural constraints are geometric), this
thesis regards quantum mechanics as a new principle theory of logical
structure (here the structural constraints on events are logical).

In his 1968 paper on quantum logic, Putnam characterized the difference 
between classical logic and quantum logic as a failure of distributivity. This is a 
very weak characterization of the new logical structure, and in many ways it is 
quite misleading if taken apart from the context of the later development of the 
thesis. It is just because Gibbins confines his evaluation of quantum logic to a 
characterization in terms of what the structure lacks relative to classical logic, 
that he draws the conclusion that the quantum logical thesis is non- 
explanatory. 
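
The failure of distributivity that Putnam's characterization turns on can be checked directly in the smallest case. The following is only a minimal sketch, assuming the lattice-of-subspaces reading of quantum logic (propositions as subspaces of a 2-dimensional real space, represented by their orthogonal projectors). The helper names `projector`, `range_projector`, `join` and `meet` are illustrative and are not taken from any of the works under review.

```python
import numpy as np

TOL = 1e-9  # eigenvalues below this threshold are treated as zero


def range_projector(psd):
    """Orthogonal projector onto the range of a symmetric positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(psd)
    basis = vecs[:, vals > TOL]
    return basis @ basis.T


def projector(*vectors):
    """Orthogonal projector onto the span of the given vectors."""
    m = np.array(vectors, dtype=float).T  # columns span the subspace
    return range_projector(m @ m.T)


def join(p, q):
    """Least upper bound: projector onto the span of the two subspaces."""
    return range_projector(p + q)


def meet(p, q):
    """Greatest lower bound: projector onto the intersection (via De Morgan)."""
    eye = np.eye(p.shape[0])
    return eye - join(eye - p, eye - q)


# Three distinct rays in the plane (an assumed toy example).
a = projector([1, 0])
b = projector([0, 1])
c = projector([1, 1])

lhs = meet(a, join(b, c))            # a AND (b OR c): b OR c is the whole plane
rhs = join(meet(a, b), meet(a, c))   # (a AND b) OR (a AND c): both meets are zero

print(np.round(lhs, 6))
print(np.round(rhs, 6))
```

Here the left-hand side comes out as the projector onto the ray `a`, while the right-hand side is the zero projector, so the distributive law fails in the lattice. This illustrates only the lattice characterization; it says nothing about the partial Boolean algebra reading.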

The structure of quantum logic is not merely non-distributive. In fact, non-
distributivity turns out to be something of a red herring. In terms of the Kochen
and Specker analysis, quantum logic has a very specific sort of non-Boolean
structure: it is, in the general case, a partial Boolean algebra that is not
embeddable into a Boolean algebra. Here it is useful to think of a logic in terms
of its Lindenbaum-Tarski algebra. The Lindenbaum-Tarski algebra of classical
logic (if you like, the propositional logic of classical mechanics) is a Boolean 
algebra (the propositional logic of a classical mechanical system is isomorphic 
to the power set algebra generated by the Borel subsets of the position- 


5 See (Einstein, [1949]), p. 53 and (Bub and Demopoulos, [1974]), p. 92.

6 See (Bub, [1976a], [1976b], [1977], [1981], [1982]), (Bub and Demopoulos, [1974]),
(Demopoulos, [1974], [1976], [1977]), (Stairs, [1982], [1983a], [1983b]). I leave out of
account the wide variety of different approaches to quantum logic (modal, non-modal, etc.) as
pursued in Geneva (Jauch, Piron), Marburg (Ludwig, Neumann), Cologne (Mittelstaedt,
Stachow), Amherst (Foulis, Randall), and in the work of van Fraassen, Gudder, Greechie,
Hardegree, Dalla Chiara, Beltrametti, Cassinelli, and many others. In a work subtitled ‘The
Limits of Quantum Logic’ one might have expected some recognition of this embarras de
richesse, with an indication of the author's orientation, even granted that the account is
intended to be introductory. For a survey and references, see (Hooker and Holdsworth, [1983]).


202 Jeffrey Bub 


momentum phase space of the system). The Lindenbaum-Tarski algebra of 
quantum logic looks like a lot of Boolean algebras ‘pasted together’ in a certain 
complicated way (i.e. certain elements in the Boolean sub-algebras are
identified). The Boolean non-embeddability of the entire structure (as opposed 
to the mere non-distributivity of the structure) excludes the introduction of 
hidden variables (no extension is possible in which property states are 
definable as Boolean ultrafilters). If the structure is characterized as a partial 
Boolean algebra (as opposed to a non-Boolean lattice) then it even turns out 
that the distributive law is valid in quantum logic. This involves a natural 
extension of the notion of classical validity (c-validity) to quantum validity (q- 
validity) by extending the set of structures over which validity is defined from 
Boolean algebras to partial Boolean algebras. While the distributive law is q- 
valid (valid in all partial Boolean algebras), there exist classical tautologies 
which are not q-valid, and also classical contradictions which are q-satisfiable. 
The picture is roughly this: Suppose we regard the valid propositional 
functions as invariants characterizing the logical structure of events, just as 
the invariants of a group of geometric transformations characterize the space- 
time structure of events. (For example, in an algebraic formulation of classical 
logic, the classical tautologies specify all Boolean functions that yield the unit 
element in a Boolean algebra when elements of the algebra are substituted for 
the variables—in effect, the tautologies represent invariants under a group of 
logical automorphisms.) The Boolean embeddability of the partial Boolean 
algebra of a quantum mechanical system with a 2-dimensional Hilbert space 
means that this logical structure is essentially Boolean. All the classical 
tautologies are valid in this partial Boolean algebra, as well as some
propositional functions that are not classically valid. This means that we can introduce
hidden variables in the Kochen and Specker sense for such systems, i.e. we can 
define property states for such systems (even though this partial Boolean 
algebra is non-distributive when considered as a lattice). The classical 
tautologies will remain valid under an extension of the structure to a Boolean 
algebra, while those propositional functions that are valid in the structure but 
are not classical tautologies will no longer map all sequences in the Boolean 
algebra onto the unit. In the case of logical structures associated with three 
and higher dimensional Hilbert spaces, some classical tautologies are not valid 
(in the relevant logical structure). Such logical structures are essentially non- 
Boolean: there are no possible extensions of these structures to Boolean 
algebras, because all classical tautologies are valid in every Boolean algebra, 
while some propositional functions which are classical tautologies do not map
every sequence of elements in these partial Boolean algebras onto the unit. 

The quantum logical thesis becomes explanatory to the extent that it can be
shown that just the right sort of probability theory is generated by the 
particular logical structure imposed by quantum mechanics as constraints on 
events. There are various results that bear on this question by Gleason, Piron, 




and others.⁷ In particular, it can be shown that the projection postulate (in
fact, the Lüders rule for non-maximal measurements) is just the conditionalization
rule in quantum logic that is the analogue of classical conditionalization.⁸
So we have a clear strategy for resolving the measurement problem in
terms of an explanation of quantum interference that is not merely negative in 
character: the analysis yields the right probabilities and does not just serve to 
block the wrong inferences (for example, the inference that would yield the 
wrong distribution in the 2-slit experiment, as in the Putnam analysis 
criticized by Gibbins). 
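
The sense in which the Lüders rule is the analogue of classical conditionalization can be put in a few lines. This is a sketch in standard Hilbert space notation, not a reproduction of the argument in the cited papers; the symbols $\rho$ (a density operator), $P$ (the projection conditioned on), $E$ (the projection whose probability is sought) and $p_\rho$ are my own labels.

```latex
% Lüders rule for a (possibly non-maximal) projection P and state \rho:
\[
  \rho \;\longmapsto\; \rho_P = \frac{P \rho P}{\operatorname{Tr}(\rho P)},
  \qquad
  p_\rho(E \mid P) = \operatorname{Tr}(\rho_P\, E)
                   = \frac{\operatorname{Tr}(P \rho P\, E)}{\operatorname{Tr}(\rho P)} .
\]
% If E \le P (so that EP = PE = E), cyclicity of the trace gives
% Tr(P \rho P E) = Tr(\rho P E P) = Tr(\rho E), and therefore
\[
  p_\rho(E \mid P) = \frac{\operatorname{Tr}(\rho\, E)}{\operatorname{Tr}(\rho P)}
                   = \frac{p_\rho(E)}{p_\rho(P)},
\]
% which has the same form as classical conditionalization,
% p(A \mid B) = p(A \cap B) / p(B), restricted to events A contained in B.
```

For projections $E$ that do not commute with $P$, the numerator $\operatorname{Tr}(P\rho P\,E)$ does not reduce in this way, and it is there that the interference terms appear.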

Of course, there are serious problems with quantum logic as an explanatory 
thesis, but these are not engaged by the narrow focus of Gibbins’ discussion. 
The essay on quantum logic is the weakest component of Gibbins’ book, both
in the historical and in the critical sense. It is even uncharitable as a critique of
limited scope applied to a particular stage in Putnam's thinking on the subject,
because it fails to mention other work by Putnam that has been part of the
literature on the subject for some time now.⁹ In all other respects, though, the
book is an admirable introduction to the debate on the foundations of quantum 
mechanics and can be very usefully studied as a course text. 

There is a far superior discussion of quantum logic in Redhead's book (the 
last chapter). Here we have a clear characterization of what is meant by the 
proposal that the logic of quantum mechanics is non-Boolean, in terms of a 
generalization of the classical notion of validity over a set of non-Boolean 
algebraic structures generated by Hilbert space projection lattices. Ultimately, 
Redhead is skeptical. He shows in what sense one may assert, with Putnam, 
that ‘every observable has a value, but there is no value which it has!’ 
(Redhead, p. 166) and goes on to ask (Redhead, p. 167): ‘But is this sort of talk 
genuinely helpful in understanding QM? It may be argued that, far from 
helping to resolve the mysteries of QM, it merely substitutes one mystery for 
another, viz. how to make sense of the sort of slogans we have been repeating.’ 
The Kochen and Specker theorem seems to preclude the possibility of 
introducing a realist semantics for quantum logic in a sufficiently robust sense 
to resolve the conceptual puzzles of quantum mechanics. ‘But if we don’t do 
that, can we really be said to be retaining realism?’ (p. 167)

This is a clear and cogent critique of the realist version of the quantum 
logical interpretation. A more complete account would have included a 
discussion of non-Boolean probability theory (the classical Kolmogorov 
axiomatization is a Boolean theory), in particular non-Boolean conditionaliza- 
tion, since the explanatory efficacy of quantum logic in resolving problems of 
interpretation derives largely from the ability of the logic to generate the right 


7 (Gleason, [1957]), (Piron, [1976]), (Varadarajan, [1968]). 

8 See (Bub, [1977]).

9 Gibbins mentions (Putnam, [1974]), but not (Friedman and Putnam, [1978]), (Putnam,
[1981]).




statistics. In any event, the problems raised by Redhead are only sharpened by 
a detailed consideration of probabilities. My own view is that the realist 
quantum logical interpretation remains a very interesting and viable research 
programme, with real successes in spite of some undeniable failures. After all, 
we make quantum mechanics by imposing certain commutation relations on 
the commutative algebra of dynamical variables of classical mechanics, i.e. we 
replace the commutative algebra of physical magnitudes of classical mecha- 
nics by a non-commutative algebra. The algebra of physical magnitudes is 
generated by the sub-algebra of idempotent physical magnitudes, the 
magnitudes with possible values 1 and 0, and this sub-algebra is the
Lindenbaum-Tarski algebra of the logic (Boolean in the case of a commutative 
algebra of magnitudes, non-Boolean in the case of a non-commutative 
algebra). If we are forced to conclude that a retrenchment from a full-blown 
realism is required by the transition from classical mechanics to quantum 
mechanics, this would be a philosophical insight of some importance. 
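
The remark about idempotent magnitudes rests on two elementary facts, which may be worth spelling out. The following is a sketch in my own notation ($P$ and $R$ for idempotent self-adjoint magnitudes, $\mathbf{1}$ for the unit), and it covers only the commuting case.

```latex
% An idempotent magnitude P can only take the values 0 and 1:
\[
  P^2 = P,\quad P\psi = \lambda\psi\ (\psi \neq 0)
  \;\Longrightarrow\; \lambda^2 = \lambda
  \;\Longrightarrow\; \lambda \in \{0, 1\}.
\]
% For commuting idempotents P and R (PR = RP) the Boolean operations
% are expressible algebraically:
\[
  P \wedge R = PR, \qquad P \vee R = P + R - PR, \qquad P' = \mathbf{1} - P .
\]
```

When $PR \neq RP$ the product $PR$ is not self-adjoint, so it is no longer an idempotent magnitude and these formulas are unavailable; this is where the non-Boolean structure enters.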

The thrust of Redhead’s book concerns the importance of the Kochen and 
Specker theorem and Bell's locality argument, not quantum logic. There is an 
excellent introduction to the formalism of quantum mechanics in the Dirac 
notation, followed by a discussion of problems of interpretation in terms of 
three views, labelled A, B, and C, concerning what one can say about the value 
of a physical magnitude or quantum mechanical observable Q when the state 
of the system is not an eigenstate of Q. A is the view that Q has a sharp but 
unknown value (labelled the hidden variable interpretation), B is the view that
Q has an unsharp or 'fuzzy' value (labelled the propensity or potentiality 
interpretation), and C is the view that the value of Q is undefined or 
'meaningless' (labelled the complementarity interpretation). I have never 
understood what it means to say that a physical magnitude has an unsharp 
value, for example that an electron has an unsharp position (it does not mean 
that the electron is a cloud), and I don't see the three views presented here as a 
useful characterization of the major strands underlying rival interpretations of 
the theory, or even as succinct characterizations of the hidden variable 
interpretation vs. the propensity interpretation vs. the complementarity 
interpretation. The distinction serves as a paedagogical device to sharpen the 
analysis of the Einstein-Podolsky-Rosen completeness argument in Chapter 3 
and the Bell locality argument in Chapter 4. Redhead distinguishes five senses 
of locality and shows that, on the basis of the violation of Bell’s inequality, 
quantum mechanics entails a violation of different senses of locality appro- 
priate to the different views A, B, and C. There is a brief but useful discussion of 
experimental tests of the inequality and of alternative forms of the inequality. 
The chapter on Bell is followed by a full discussion of the Kochen and Specker 
theorem, with an informative justification of some of the assumptions
underlying the theorem. All in all, the two chapters on the locality issue and 




the Kochen and Specker no hidden variables theorem are by far the best
expository accounts I have seen in print. 

Chapter 6 is essentially the Heywood and Redhead argument (Heywood and 
Redhead, [1983]) that a realist interpretation of quantum mechanics is 
committed to denying either ‘ontological locality’ or ‘environmental locality’,
at least if some apparently innocent principles are assumed for possessed
values of observables. The investigation originated with the following issue: As
pointed out above, one way of avoiding the Kochen and Specker argument is to
split non-maximal physical magnitudes into a ‘fan’ of magnitudes, one for
each maximal magnitude of which the non-maximal magnitude is a function.
Consider two spatially separated spin-1/2 systems, S₁ and S₂, as in the Bell
argument. The statistics of the composite system is represented on a 2 × 2-
dimensional Hilbert space. S₁-magnitudes are non-maximal in the composite
Hilbert space, but maximal in the 2-dimensional Hilbert space of S₁. Similarly
for S₂. Thus, Sᵢ-magnitudes may be said to be ‘locally maximal’. Now, there is a
result by Maczynski (Maczynski, [1971]) that says that if we consider only 
maximal physical magnitudes, then there is no contradiction in assigning one 
and only one value to each maximal magnitude, in such a way as to preserve 
all functional relationships which hold among the magnitudes (as specified by 
the algebra of magnitudes). 

The question arises as to whether Maczynski's result can be extended to 
locally maximal magnitudes as well. If not, i.e. if it could be shown that every 
extension of a value assignment to the maximal magnitudes involves a 
contradiction when extended to the locally maximal magnitudes, then we 
would have a purely algebraic proof of non-locality, i.e. a purely algebraic 
proof of Bell's result. As it turns out, this cannot be shown: we need more 
magnitudes than just the maximal magnitudes and the locally maximal 
magnitudes to generate a contradiction in value assignments preserving 
functional relationships. So there is no purely algebraic proof of non-locality 
along these lines. The closest thing in the literature to such a proof is provided 
by the Heywood and Redhead argument. A violation of ontological locality 
means that we can't specify the properties of S₁ independently of the properties
of the composite system S₁ + S₂ (cf. ‘outcome independence’, Jarrett's
‘completeness’ above for stochastic hidden variable theories). Under the
assumption of ontological locality, a violation of ‘environmental locality’ means that a
property of S₁ can be altered by changing the setting of a measurement
apparatus interacting with S₂, i.e. by changing the kind of measurement
performed on S₂ (cf. ‘parameter independence’, Jarrett's ‘locality’ above for
stochastic hidden variable theories). 

What Redhead's book lacks is a detailed critical analysis of the measurement
problem, but this is hardly a criticism. The preliminary discussion in Chapter 2 
concludes with the disclaimer (Redhead, p. 56): ‘What we shall do in this book 
is to assume that the measurement problem has been solved, for example 




simply to ignore the objections to the “two time evolutions” solution. We are 
going to concentrate on other interpretative problems of QM.’ Evidently, the 
purpose of the book is to explore the difficulties raised for a realism of possessed
values by the Kochen and Specker theorem and the locality problem. The 
achievement is a thoroughly convincing argument for the conclusion that 
‘some sort of action-at-a-distance or (conceptually distinct) nonseparability 
seems built into any reasonable attempt to understand the quantum view of 
reality’ (Redhead, p. 169). Notes and references at the end of each chapter and 
an appendix on vector spaces and lattices enhance the book's value as a course 
text, at a level of sophistication considerably above that of the Gibbins book. 

Redhead's book is characterized as a 'prolegomenon to the philosophy of 
quantum mechanics'. It develops a series of interrelated problems that any
realist interpretation worthy of the name will have to face, without proposing 
a solution. By contrast, Krips purports to have a realist interpretation that 
resolves these problems, as well as the measurement problem. 

First, a brief summary. An Introduction indicates that the core idea behind 
the new interpretation is a principle, termed (Bohr)'¹⁰, for attributing
determinate values to physical magnitudes. (Bohr)' is a generalization of the 
principle (Bohr), equivalent to what I called the 'orthodox' property state 
assumption above. The first chapter is on Bohr and contains a lengthy analysis 
of the concept of indeterminacy. The conclusion that an indeterminate 
position in quantum mechanics is 'a new sort of property, different from that of 
having any particular position(s) precisely, but theoretically respectable in its 
own right', (Krips, p. 34) is hardly illuminating. 'What has happened is that 
the concept of indeterminacy has itself been precisified and incorporated into
theoretical discourse. Thus the programme of eliminating indeterminate (in 
the broad sense) terminology has actually been furthered by the paradoxical 
device of including the term “indeterminate” within QT.’ (Krips, p. 35) Oh, 
come now! 

The discussion of state vectors and probabilities in Chapter 2 formulates a 
propensity interpretation and concludes with a discourse on the state vector as 
a single-system property. There is an appeal here to 'Hacking's criterion of 
reality: an entity is real if it can be kicked (manipulated) and kick back (have 
effects)’. (Krips, p. 60.) I always thought this came from Popper out of Landé.¹¹
Chapter 3 outlines Krips’ modification of the pilot-field view. The idea here is 
that a quantum system is a particle piloted by a probability field. Krips takes the 
location of the particle in the field to be indeterminate in order to avoid 
difficulties with the ‘no go’ hidden variable theorems. Chapter 4 deals with the 
distinction between pure and mixed states, the ignorance interpretation of 
mixtures, and the notion of a subsystem reduced state with respect to a 
composite system state. The heart of Krips’ realist interpretation is developed 
10 This principle is misprinted as (Bohr)—a distinct principle—on p. 3. 

11 See (Popper, [1967]), p. 33. 




here. It involves accepting the principle labelled (Bohr)’, and various 
equivalents, and rejecting the converse. This means that if a quantum system 
is in a mixed state represented as a probability distribution over eigenstates of 
some physical magnitude, and these eigenstates correspond to distinct 
eigenvalues, then the system has a determinate value for that magnitude, and 
this determinate value is one of the eigenvalues corresponding to the 
eigenvectors in the mixture. What we can't infer is that if a system has a 
determinate but unknown value for a physical magnitude, then the state of the 
system is a mixture over the corresponding eigenstates. 

Chapter 5 applies this idea to the measurement problem and to the Einstein- 
Podolsky-Rosen argument. It is easy to see how the measurement problem is
resolved. Recall the earlier discussion in terms of the system S + M. If S + M is in
the pure state Ψ that is the closest thing we can get to the mixed state W that
we apparently want after a measurement interaction, then S separately and M 
separately are represented by mixed statistical states, classical probability 
distributions over pure statistical S-states (the eigenstates of the physical 
magnitude measured) and pure statistical M-states ('pointer reading' states), 
with the same probabilities as the composite system mixed state W. By the 
principle (Bohr)’, S has a determinate value of the magnitude measured, and M 
has a determinate pointer reading. By fiat, i.e. by denying the converse of 
(Bohr)', we simply block the route backwards, so that we cannot infer a mixed 
statistical state for the composite system from the presumption that S and M
are determinate with respect to the physical magnitudes correlated by the 
measurement interaction. What this means is that the mixed statistical state 
W that we apparently want after the measurement interaction is ultimately 
motivated by assuming the converse of (Bohr)'. If we reject this assumption, 
then what we really want is the associated pure state, and this is what we get 
from the theory. Quod erat demonstrandum. 
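
The formal fact on which this resolution trades, that a pure state of S + M yields mixed reduced states for S and for M separately, is easy to exhibit numerically. What follows is a minimal sketch under assumed values: a two-outcome measurement with probabilities 0.2 and 0.8, and function names (`reduced_states`, `purity`) chosen purely for illustration. It shows only the reduced states; it takes no stand on whether (Bohr)' is acceptable.

```python
import numpy as np


def reduced_states(psi, dim_s, dim_m):
    """Reduced density matrices of S and M for a pure state of S + M.

    psi lists the amplitudes of |s_i>|m_j> in row-major order (index i * dim_m + j).
    """
    c = np.asarray(psi, dtype=complex).reshape(dim_s, dim_m)
    rho_s = c @ c.conj().T   # partial trace over M
    rho_m = c.T @ c.conj()   # partial trace over S
    return rho_s, rho_m


def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, less than 1 for a mixed state."""
    return float(np.real(np.trace(rho @ rho)))


# Assumed post-interaction state: sqrt(0.2)|s_1>|m_1> + sqrt(0.8)|s_2>|m_2>
psi = np.array([np.sqrt(0.2), 0.0, 0.0, np.sqrt(0.8)])
rho_total = np.outer(psi, psi.conj())
rho_s, rho_m = reduced_states(psi, 2, 2)

print(purity(rho_total))        # composite state is pure
print(np.round(rho_s.real, 3))  # diagonal in the eigenbasis of the measured magnitude
print(purity(rho_s))            # reduced state of S is mixed
```

The reduced states of S and M are both diagonal with weights 0.2 and 0.8, which are the probabilities of the composite mixed state W, while the composite state itself remains pure.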

The following chapters, on realism, on Kochen and Specker, on the Stapp 
and Eberhard proof of a version of the Bell inequality, and on the Heywood and
Redhead argument, are somewhat anti-climactic after this. Suffice it to say 
that Redhead does it better. Much of the discussion is in code (constant 
reference to the list of 71 principles is required here). For example: ‘More 
importantly, however, it means that the arguments of the previous section can
be taken immediately to show that realism, qua the principles (Det Q) and (Pass
Q)’, contradict a QT which has been standardly de-Ockhamized (and hence
includes (C1)). This is because from section 1 we see that QT, including (VR)’ 
plus (C1) and (Pass Q)', imply (VR), and from section 2 we then see that (VR) 
plus QT and (Det Q) contradict BWOLOC’. (Krips, p. 203.) 

The argument concerning realism is simply that, while quantum mechanics 
may require us to live with the fact that physical magnitudes need not have 
determinate values, or that ideal measurements need not yield the possessed 




values of physical magnitudes just prior to measurement, this in itself is no 
reason for rejecting realism in some broad sense. 

Aside from an annoying tendency to multiply principles and formal 
complexities without much relevance to the conceptual problems at issue, the 
real failure of the book is that the proposed realist interpretation is simply a 
non-starter. 

Why? Well, the idea is essentially to resolve the measurement problem by
accepting the principle (Bohr)’ while rejecting the converse. Now, in certain 
cases the application of this principle will yield a set of determinate quantities 
that is extensive enough to involve a contradiction with the Kochen and 
Specker theorem. This will occur, for example, if the reduced statistical state for
S is proportional to the unit operator. Then every quantity is determinate. If S is
a system associated with a Hilbert space of three or more dimensions, then S 
can have determinate values for all physical magnitudes only if every non- 
maximal magnitude is completely de-Ockhamized. 
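
A concrete case of a reduced state proportional to the unit operator may help here. The following is a sketch under assumed notation: orthonormal bases $\{|s_i\rangle\}$ and $\{|m_i\rangle\}$ for 3-dimensional Hilbert spaces of S and M, and $W_S$ for the reduced statistical state of S. None of these labels is Krips'.

```latex
\[
  |\Psi\rangle = \frac{1}{\sqrt{3}} \sum_{i=1}^{3} |s_i\rangle \otimes |m_i\rangle
  \quad\Longrightarrow\quad
  W_S = \operatorname{Tr}_M |\Psi\rangle\langle\Psi|
      = \frac{1}{3} \sum_{i=1}^{3} |s_i\rangle\langle s_i|
      = \tfrac{1}{3}\,\mathbf{1}.
\]
```

Since $\tfrac{1}{3}\,\mathbf{1}$ takes the same form in every orthonormal basis, it can be written as an equal-weight distribution over the eigenstates of any maximal magnitude of S, which is why, on the reading given above, the principle makes every quantity determinate in this case.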

Krips recognizes this (‘Basically my solution of the latter difficulties will 
follow van Fraassen's de-Ockhamization strategy . . .'. Krips, p. 103) but seems 
not to be concerned with the following implication: For any system S, and any 
non-maximal physical magnitude Q, we can always couple S to another 
system M by a suitable interaction in such a way that the application of the 
principle (Bohr) to the reduced state of S requires the de-Ockhamization of Q to 
a set of magnitudes Q′, Q″, . . . Ultimately, considering all possible such
interactions, all possible systems M, this means that a quantum system S is not 
characterized (ontologically) by the non-commutative algebra of physical 
magnitudes specified by quantum mechanics, but by the non-commutative 
algebra of completely de-Ockhamized magnitudes. If you replace quantum 
mechanics by this theory of de-Ockhamized magnitudes, the game is over. All 
problems of interpretation disappear. But this solution to the problem is no 
solution at all. Clearly, any statistical theory can be de-Ockhamized in this way 
and the possibility of a realist interpretation along these lines has nothing in 
particular to do with quantum mechanics. Krips' proposed realist interpreta- 
tion cannot avoid this slide into triviality. The fact that the interpretation 
involves the acceptance of certain principles and the rejection of others is of no 
consequence. One might as well introduce complete de-Ockhamization¹² at
the outset and be done with it. 

Having come down rather hard on Krips relative to Gibbins and Redhead, I 
do want to emphasize that The Metaphysics of Quantum Theory is a highly 
sophisticated and very competent work. Much of what is negative in this 
review stems from a fundamental difference in style between my own 
approach to these problems and that of Krips. I don't think the interpretation

12 By ‘complete’ de-Ockhamization I mean the replacement of every non-maximal magnitude Q by
a ‘fan’ of non-maximal magnitudes, Q′, Q″, . . . , one for each maximal magnitude of which Q is a
function. I distinguish this from a ‘principled’ de-Ockhamization, which might be defensible.




works, the way in which the issues are tackled is, from my standpoint, quite 
perverse, but the book is certainly stimulating in an infuriating sort of way. It is
not a book for the uninitiated, but will repay close study by those for whom 
purgatory is a necessary prelude to salvation. 

So where are we now? I think Redhead’s conclusion is the correct one, that 
what we have learnt from the philosophy of quantum mechanics is that ‘some 
sort of action-at-a-distance or (conceptually distinct) nonseparability seems 
built into any reasonable attempt to understand the quantum view of reality' 
(Redhead, p. 169). It is not simply that quantum mechanics describes the 
world in terms of a statistics that cannot be reduced to measures over property 
states in the usual way. Even if we are prepared to settle for a description of the 
state of a physical system as purely statistical, i.e. as specified by a quantum
mechanical state, this state cannot be taken as characterizing, even in a
statistical sense, what Einstein referred to as the 'being-thus'¹³ of the system.
It appears that the 'being-thus' of a physical system in a quantum world is
inextricably entangled with the 'being-thus' of other systems, whether or not
these systems are spatially separated, so that even a purely statistical
description of the world as consisting of separate systems in interaction is
excluded. As for stories which purport to shed light on this puzzling state of
affairs, I fear that, so far at least, the uncertainty principle for the philosophy of
quantum mechanics applies. 


JEFFREY BUB 
University of Maryland 


13 See (Einstein, [1948]), p. 169.


REFERENCES 


BELL, J. S. [1964]: ‘On the Einstein-Podolsky-Rosen Paradox’, Physics, 1, 195-200.

BUB, J. [1976a]: ‘What is Philosophically Interesting About Quantum Mechanics?’ in R.
E. Butts and J. Hintikka (eds.), Foundational Problems in the Special Sciences
(Proceedings of the 5th International Congress on Logic, Methodology and Philosophy of
Science), pp. 69-79. Dordrecht: Reidel.

BUB, J. [1976b]: ‘The Statistics of Non-Boolean Event Structures’, in W. L. Harper and C.
A. Hooker (eds.), Foundations of Probability Theory, Statistical Inference, and
Statistical Theories of Science Vol. III (Foundations and Philosophy of Statistical
Theories in the Physical Sciences), pp. 1-16. Dordrecht: Reidel.

BUB, J. [1977]: ‘Von Neumann's Projection Postulate as a Probability Conditionalization
Rule in Quantum Mechanics’, Journal of Philosophical Logic, 6, 381-90.

BUB, J. [1981]: ‘What Does Quantum Logic Explain?’ in E. Beltrametti and B. Van
Fraassen (eds.): Current Issues in Quantum Logic, pp. 89-100. New York: Plenum.

BUB, J. [1982]: ‘Quantum Logic, Conditional Probability, and Interference’, Philosophy
of Science, 49, 402-21.

BUB, J. and DEMOPOULOS, W. [1974]: ‘The Interpretation of Quantum Mechanics,’ in R.
S. Cohen and M. Wartofsky (eds.), Boston Studies in the Philosophy of Science Vol.
XIII, pp. 92-122. Dordrecht: Reidel.

CLAUSER, J. F. and HORNE, M. A. [1974]: ‘Experimental Consequences of Objective Local
Theories’, Physical Review, D10, 526-35.

DEMOPOULOS, W. [1974]: ‘What is the Logical Interpretation of Quantum Mechanics?’,
in R. S. Cohen et al. (eds.), Philosophy of Science Association Proceedings, pp. 721-8.
East Lansing, Michigan: Philosophy of Science Association.

DEMOPOULOS, W. [1976]: ‘Fundamental Statistical Theories’, in P. Suppes (ed.), Logic and
Probability in Quantum Mechanics, pp. 421-31. Dordrecht: Reidel.

DEMOPOULOS, W. [1976]: ‘The Possibility Structure of Physical Systems’, in W. L. Harper
and C. A. Hooker (eds.), Foundations of Probability Theory, Statistical Inference, and
Statistical Theories of Science Vol. III (Foundations and Philosophy of Statistical
Theories in the Physical Sciences), pp. 55-80. Dordrecht: Reidel.

DEMOPOULOS, W. [1977]: ‘Completeness and Realism in Quantum Mechanics’, in R. E.
Butts and J. Hintikka (eds.), Foundational Problems in the Special Sciences (Proceedings
of the 5th International Congress on Logic, Methodology and Philosophy of Science), pp.
81-8. Dordrecht: Reidel.

EINSTEIN, A. [1949]: ‘Autobiographical Notes’, in Albert Einstein: Philosopher-Scientist. New
York: Harper and Row.

EINSTEIN, A. [1971]: ‘Quantum Mechanics and Reality,’ in The Born-Einstein Letters, pp.
168-173. New York: Walker and Co. Originally published as ‘Quanten-Mechanik
und Wirklichkeit’, Dialectica, 2, 320-4 [1948].

FRIEDMAN, M. and PUTNAM, H. [1978]: ‘Quantum Logic, Conditional Probability and
Interference’, Dialectica, 32, 305-15.

GLEASON, A. M. [1957]: Journal of Mathematics and Mechanics, 6, 885.

GIBBINS, P. [1987]: Particles and Paradoxes: The Limits of Quantum Logic. Cambridge:
Cambridge University Press.

HEYWOOD, P. and REDHEAD, M. L. G. [1983]: ‘Nonlocality and the Kochen-Specker
Paradox’, Foundations of Physics, 13, 481-99.

HOOKER, C. A. and HOLDSWORTH, D. [1983]: ‘Critical Survey of Quantum Logic’, in Logic
in the 20th Century, pp. 127-246. (Special issue of Scientia).

JARRETT, J. [1984]: ‘On the Physical Significance of the Locality Conditions in the Bell
Argument’, Noûs, 18, 569-89.

KOCHEN, S. and SPECKER, E. P. [1967]: ‘The Problem of Hidden Variables in Quantum
Mechanics’, Journal of Mathematics and Mechanics, 17, 59-87.

KRIPS, H. [1987]: The Metaphysics of Quantum Theory. Oxford: Clarendon Press.

LÜDERS, G. [1951]: ‘Über die Zustandsänderung durch den Messprozess’, Annalen der
Physik, 8, 322-8.

MACZYNSKI, M. J. [1971]: ‘Boolean Properties of Observables in Axiomatic Quantum
Mechanics’, Reports on Mathematical Physics, 2, 135-50.

PIRON, C. [1976]: Foundations of Quantum Physics. Reading (Mass.): W. A. Benjamin.

POPPER, K. R. [1967]: ‘Quantum Mechanics without “The Observer”’, in M. Bunge
(ed.): Quantum Theory and Reality. Berlin: Springer.

PUTNAM, H. [1969]: ‘Is Logic Empirical’, in R. Cohen and M. Wartofsky (eds.): Boston
Studies in the Philosophy of Science, pp. 216-241. Dordrecht: Reidel. Reprinted as
‘The Logic of Quantum Mechanics’ in Mathematics, Matter and Method: Philosophical
Papers Vol. 1, pp. 174-97. Cambridge: Cambridge University Press, [1979].




PUTNAM, H. [1981]: ‘Quantum Mechanics and the Observer’, Erkenntnis, 16, 193-219.

REDHEAD, M. [1987]: Incompleteness, Nonlocality, and Realism: A Prolegomenon to the
Philosophy of Quantum Mechanics. Oxford: Clarendon Press.

STAIRS, A. [1982]: ‘Quantum Logic and the Lüders Rule’, Philosophy of Science, 49,
422-36.

STAIRS, A. [1983a]: ‘On the Logic of Pairs of Quantum Systems’, Synthese, 56, 47-60.

STAIRS, A. [1983b]: ‘Quantum Logic, Realism and Value-Definiteness’, Philosophy of
Science, 50, 578-602.

SCHRÖDINGER, E. [1935]: ‘Die Gegenwärtige Situation in der Quantenmechanik’,
Naturwissenschaften, 23, 807-12, 823-8, 844-9. Translated as ‘The Present
Situation in Quantum Mechanics: A Translation of Schrödinger's “Cat Paradox”
Paper’, in J. A. Wheeler and W. H. Zurek (eds.): Quantum Theory and Measurement.
Princeton: Princeton University Press, [1983].

VARADARAJAN, V. S. [1968]: Geometry of Quantum Theory Vol. 1. Princeton: Princeton
University Press.

VON NEUMANN, J. [1955]: Mathematical Foundations of Quantum Mechanics. Princeton:
Princeton University Press. First published as Mathematische Grundlagen der
Quantenmechanik. Berlin: Springer, [1932].


Brit. J. Phil. Sci. 40 (1989), 213-217 Printed in Great Britain 


REVIEW ARTICLE 
Anti-Realism and Logic 


TENNANT, NEIL [1987]: Anti-Realism and Logic. Oxford University Press. 
xii + 325 pp. (Hardback. ISBN 0-19-824925-X). 


The main thrust of Tennant's book is to argue the case for his intuitionist 
relevant logic (IRL) as the one true logic. He has a battery of arguments for this 
conclusion which he spreads throughout the book. I imagine that most readers 
of this book will not be overly familiar with the particular format that Tennant 
uses for the exposition of his logic since it requires a knowledge of natural 
deduction in tree form. Indeed for ease of understanding I would suggest that 
intending readers first acquaint themselves with Tennant's earlier work 
Natural Logic since little help is given the reader unfamiliar with the proof-theoretic 
work of Prawitz on which much of Tennant's formal work depends. 

There is a second characterization of IRL (proved by Tennant to be coextensive 
with the natural deduction formulation), however, which is much 
more accessible since it is formulated via a Gentzen type sequent calculus and 
may be stated simply: take a Gentzen sequent formulation of intuitionist logic 
and remove the cut and thinning structural rules. This formulation does 
indeed deserve the epithet 'elegant' which Tennant claims for IRL but which I 
find hardly applicable to its natural deduction formulation. Aesthetic considerations 
aside, there are other reasons for regretting the fact that Tennant directs 
all of his efforts towards the natural deduction formulation rather than the 
sequent formulation. Much of Tennant's book is concerned with the logical 
operators and the desirability of conservatively extending the atomic base when 
introducing the logical operators into a language which does not possess them. 
At no stage in the course of his argument does he indicate why natural 
deduction should be chosen as the medium for the introduction and 
elimination rules of the operators. It is assumed without argument. This is 
particularly unfortunate if we want to compare the arguments and their 
consequences with those contained in Hacking's 'What is Logic?' where the 
notion of conservative extension plays the major role, but in the context of 
sequent formulations of logic. In Hacking's paper the deducibility relation (⊢) 
is assumed from the outset to satisfy A ⊢ A for atomic sentences together with 
the cut and thinning rules. Whenever a logical operator is introduced with its 
left and right introduction rules Hacking argues that it must not add to the 
existing deducibility relation between sentences not containing that operator; 
furthermore, sentences containing it should be shown to satisfy A ⊢ A and the 
structural rules of cut and thinning. Hacking’s reasons for insisting on these 
constraints are beyond the scope of this review but Tennant should have taken 
some cognizance of them for his conclusions are directly opposed to those of 
Hacking when the sequent formulation of IRL is compared to it. Firstly the 
structural rules which Hacking takes as paramount and guide for the 
introduction rules are discounted by Tennant altogether: indeed they are seen 
by him as the source of irrelevance for they allow, even within intuitionist 
logic, the proof of A. ~A ⊢ B and A, ~A ⊢ B. Secondly the propositional 
fragment of logic with which Hacking concludes is classical and thus different 
from the propositional fragment of Tennant's IRL. Since similar considerations 
are the source of both Tennant's and Hacking's investigations the reader 
would expect some critical account of Hacking's paper. The discussion of 
conservative extension is introduced by Tennant with reference to Dummett's 
writings and several interpretations of Dummett are provided by him. It is 
perhaps because Tennant is so involved with Dummett's anti-realist arguments 
that he concentrates on Dummett's remarks about conservative 
additions to language fragments, remarks which are couched in terms of ‘rules 
of inference’. It is a serious omission that Tennant does not explain why these 
rules are interpreted as pertaining to natural deduction rather than sequents. 
The layout of the book is such that it is difficult to perceive which logic it is 
that Tennant is advocating. Logics are introduced for different purposes at 
various places in the book. It is never clear how these logics are relevant to IRL. 
On page 71 a basic logic is introduced for the purpose of investigating Tarski 
T-sentences. Pages 79-92 introduce a propositional fragment of logic in a 
theoretical reconstruction of the introduction of conjunction, negation, 
conditional and disjunction into an atomic base. Are these logics supposed to 
support IRL? If so it is not clear to me how the latter is supposed to. The possible 
ways of introducing disjunction into a language already containing a 
conditional connective (confusingly represented by Tennant throughout by 
the material implication sign ⊃) and conjunction that Tennant outlines on 
pages 91-2 seem at variance with theorems of IRL. 
Consider the passage: 


What sentence of conditional form with conclusion C would say the same thing 
[as (A ⊃ C) & (B ⊃ C)]? For I wish to draw attention to the fact that the conclusion, 
C, is the same in either case. . . . Is not a plausible answer that I invent a new 
concatenating grunt—call it v —to bind the separate antecedents A and B into 
one, so that the consequent C can be uttered but once, and to the desired pointed 
effect?: 


(A v B) ⊃ C. (p. 91) 
This passage with its emphasis on saying the same thing would seem to 
indicate that Tennant would have to accept that whenever a sentence of the 
form (A ⊃ C) & (B ⊃ C) is a logical truth the corresponding sentence of the form 
(A v B) ⊃ C should also be a logical truth. But this is not so in IRL. A counterexample 
is provided by the pair 


(A ⊃ A) & ((B & ~B) ⊃ A), (A v (B & ~B)) ⊃ A 


where the latter is a logical truth of IRL and the former is not. If, however, the 
account on pages 79-92 is not in fact directly relevant to IRL then the reader 
should have been informed. 

One of the virtues that Tennant claims for IRL is that it avoids the Lewis 
paradoxes of entailment: A, ~A ⊢ B and A. ~A ⊢ B (as well as such analogues as 
B ⊢ A ⊃ A). Indeed its classical counterpart CRL (classical sequent logic minus 
cut and thinning) was expressly designed by Tennant to provide just such a 
logic. Among the claims he makes for both IRL and CRL is that these logics 
avoid the Lewis paradoxes, keep disjunctive syllogism and much of transitivity. 
In keeping disjunctive syllogism both IRL and CRL differ from the Anderson-Belnap 
entailment logics which were also designed to avoid the Lewis 
paradoxes. As for transitivity, IRL and CRL, shorn of the conditional operator, 
will always give transitivity of entailment when the premises are consistent 
and the consequent not a logical truth. Unfortunately though, as Tennant 
acknowledges, the lack of a conditional connective is an obstacle for the 
reconstruction of an intuitionist mathematics which would seem to require 
some such logical operator and which of course is not definable as material 
implication. One of Tennant's conjectures about IRL is that it is adequate to the 
demands on transitivity of proof normally made in mathematics. This 
conjecture could be sustained by proving suitable metatheorems among which 
would have to be one of the following form: 


If: 

X is intuitionistically consistent, and 

A is not an intuitionist logical truth, and 

X intuitionistically implies A, and 

NON-LEWIS (X, A) 
then: 

X implies A in IRL. 
Here the property NON-LEWIS (X, A) needs to be spelled out. The intuitive 
idea is that it should somehow tease out the fact that it is not the case that 
the only way to prove A from X intuitionistically is to pass through an 
intermediate stage Y ⊢ B which violates one of the other antecedent 
conditions in the conjecture. (p. 200) 


Tennant illustrates this with the example ~P ⊢ (P ⊃ Q) which can be proved 
only by passing through the state P, ~P ⊢ Q which violates the first condition. 
Doubts about the provability of the meta-theorem arise if such simple 







as A ⊢ ~B v A and ~B v A ⊢ B ⊃ A which are both theorems of IRL but for 
which transitivity fails: in IRL A ⊢ B ⊃ A does not hold. If we consider a sequent 
formulation of intuitionist logic there seems to be a way of proving A ⊢ (B ⊃ A) 
without passing through any step which violates any of the first three 
conditions: 


A ⊢ A 
A, B ⊢ A 
A ⊢ B ⊃ A 


The propositional fragment of CRL itself does compare favourably with the 
more famous of the existing systems designed to escape the Lewis paradoxes, 
however, since it includes those theorems of Anderson and Belnap's first 
degree entailment as well as the two versions that Geach has given of the 
Smiley-Geach-von Wright substitution instance accounts. In fact the relations 
between them for theorems of the form A entails B where A and B are 
sentences containing the operators ~, v, & and ⊃ are: Anderson-Belnap ⊂ 
Geach I ⊂ Geach II ⊂ CRL. That they are proper subsets can be 
illustrated by the following examples: A. (~A v B) ⊢ B is in Geach I but not 
Anderson-Belnap, (A v B). ~(A v B) ⊢ (A. B) is in Geach II but not Geach I and 
finally (A v B). (A v B) ⊢ ~(~A v ~B) is in CRL but not Geach II. CRL is then 
superior to any of the other devised systems mentioned here in the sense that 
the more classical entailments it captures the better. It does so at a price, 
however, and that price is paid in terms of second-order entailments. For here 
it is not the case that whenever we have in CRL (or IRL) A v B ⊢ C then both A ⊢ C 
and B ⊢ C. This is the case in classical logic and the Anderson and Belnap 
systems (though not in either Geach system). This should be of some 
importance for Tennant as I have indicated above in discussing Tennant's 
introduction of disjunction. 

There is much in this book besides the accumulating arguments for IRL 
which deserve careful reading. Every chapter contains observations and 
remarks that the reader would want to discuss with the author. This is at once an 
indication of the book's interest and the author's overly brief discussion of 
controversial issues. To give but one example: Tennant claims that Goldbach's 
conjecture cannot be undecidable in the strong reading that it is impossible to 
prove or refute for (my italics) 


if 'undecidable' had the strong[er] reading I have rejected, one would never be 
able to warrant an assertion to the effect that a sentence is undecidable. For to do 
so would be in part to warrant the assertion that it was impossible to prove. This 
in turn would involve reducing to absurdity the assumption that one could prove 
it. But this would be to disprove it, hence decide it. (p. 119) 


In the case of the Goldbach conjecture it may be true that such a strong reading 
of undecidable should not be given for here the predicate '... is the sum of two 
primes' is a decidable predicate, and so if Goldbach's conjecture can be shown 
not to be refutable within formalized Peano arithmetic it would be true. But if 
we were to take, say, the conjecture that there exists an infinite number of 
Mersenne primes (a conjecture which has the form (x)(Ey)A), I cannot see 
that a proof of its undecidability in the strong sense is out of the question. It is 
certainly possible that it can be shown to be undecidable within a formalized 
Peano arithmetic. But if such should be the case where else could its 
decidability be sought? 

There are chapters parts of which I found very difficult to understand—I 
failed to absorb the formal parts of Chapter 19, dealing with transitional 
atomic logic for example. Surely, the following needs unravelling: 


an atomic sentence is both a rule and a derivation. Considered as a derivation, an 
atomic sentence derives itself by means of the rule that is also itself. (p. 216) 


There are also chapters that are a delight to read both for their intellectual 
content and style—Chapter 15 'Do We Know What We Mean When We Mean 
What We Say?’ is a jeu d'esprit. It is a pity that some of the other chapters could 
not have been polished to reflect more easily the author's thoughts. 

The book is somewhat marred by the number of typographical errors 
including an inelegant variation of Wittgenstein's name. 


A. J. DALE 
University of Hull 


Brit. J. Phil. Sci. 40 (1989), 219-222 Printed in Great Britain 


REVIEW ARTICLE 
Mr Keynes on Probability¹ 


Mr Keynes takes probabilities or probability relations as indefinable, and says 
that if q has to p the probability relation of degree a, then knowledge of p 
justifies rational belief of degree a in q. 

We have, then, numerous probability relations; these, it is commonly 
supposed, are all numerical, that is, correlated with the real numbers from 
0 to 1 in such a way that the ordinary rules of the probability calculus hold, 
e.g., that the product of the numbers correlated with two probabilities is equal 
to the number correlated with the product (in Mr Keynes' sense) of the two 
probabilities. Mr Keynes denies this; he supposes not only that not all 
probabilities are numerical, but also that it is possible to have two probabilities 
which are unequal and such that neither is greater than the other. This view is 
based on the difficulty in so many cases of saying with any confidence which of 
two probabilities is the greater, or of assigning any numerical measures to 
them. But it would appear that the force of this objection to the ordinary view is 
exaggerated by Mr Keynes for two reasons. 

First, he thinks that between any two non-self-contradictory propositions 
there holds a probability relation (Axiom I), for example between 'My carpet is 
blue' and 'Napoleon was a great general'; it is easily seen that it leads to 
contradictions to assign the probability 1/2 to such cases, and Mr Keynes 
would conclude that the probability is not numerical. But it would seem that in 


1 This review of J. M. Keynes, A Treatise on Probability (Macmillan, 1921) originally appeared in 
The Cambridge Magazine, Volume XI No. 1 (January 1922), pp. 3-5. It was listed but not 
reprinted in the posthumous collection of Ramsey's work, The Foundations of Mathematics and 
other Logical Essays, edited by Professor R. B. Braithwaite (Routledge & Kegan Paul, 1931) and 
later re-edited by me as Foundations: Essays in Philosophy, Logic, Mathematics and Economics 
(Routledge & Kegan Paul, 1978). I knew of it, therefore, but had never read it, thinking it would 
have been superseded by Ramsey's comments on Keynes' theory in his 'Truth and probability' 
(Foundations, ch. 3), until Professor Braithwaite gave me his copy of it last week. I then realized 
that it contained important comments not contained in 'Truth and probability', and agreed with 
Professor T. J. Smiley's suggestion that it should be republished. In preparing it for the printers I 
have corrected some clear typing errors, altered the quotation marks to conform to modern 
conventions, and removed some inconsistencies in punctuation and the capitalization of initial 
letters. I have also, for the benefit of Keynes' readers, restored his symbolism, which Ramsey 
varied: except that I have used the tilde instead of an overbar for negation. For those unfamiliar 
with it, 'φ(a)/S(a)·h', for example, means, in what is now a more common notation, p(φa/Sa&h), 
that is, the probability that a is φ, given h and that a is S. I should also add that 'g(φ, f)' is how 
Keynes writes that all φs are fs. (D. H. Mellor, Darwin College, Cambridge.) 




such cases there is no probability; that, for a logical relation, other than a truth 
function, to hold between two propositions, there must be some connection 
between them. If this be so, there is no such probability as the probability that 
‘my carpet is blue’ given only that ‘Napoleon was a great general’, and there is 
therefore no question of assigning a numerical value. 

Secondly, it is surely obvious that probabilities may be numerical or 
comparable without our being able to assign their numerical values or 
compare them, owing to the imperfection of our logical insight. 

Thus a probability may, as Mr Keynes admits, be unknown to us through 
lack of skill in arguing from given evidence. But he says ‘This admission must 
not be allowed to carry us too far. Probability is relative in a sense to the 
principles of human reason. The degree of probability which it is rational for us 
to entertain, does not presume perfect logical insight, and is relative in part to 
the secondary propositions we in fact know. . . . If we do not take this view of 
probability, if we do not limit it in this way and make it, to this extent, relative 
to human powers we are altogether adrift in the unknown; for we cannot ever 
know what degree of probability would be justified by the perception of logical 
relations which we are, and must always be, incapable of comprehending.’ 

But we are concerned with the relation which actually holds between two 
propositions; the faculty of perceiving this relation, accurately or otherwise, 
we call insight, perfect or imperfect. Mr Keynes argues that owing to the 
possibility that our insight may be all wrong we should talk not of the relation 
which actually holds, but of the relation which, we have reason to suppose, 
holds. Then, he thinks, we could speak without fear of unknown factors. There 
seems, however, no good reason to confine this argument to probability. In 
everything, it might be urged, owing to the possibility that there is evidence to 
which we have no access, we are only justified in saying not 'p' but ‘We have 
reason to suppose p'. The logical conclusion of this view is that we are not 
justified in saying anything at all; for our evidence about human reason might 
also be fragmentary. We cannot therefore reasonably say ‘We have reason to 
suppose the probability is a’, but only ‘We have reason to suppose that we have 
reason to suppose the probability is a', and so on ad infinitum, on the lines of a 
celebrated argument in Dr Moore's Ethics. 

Mr Keynes is like a surveyor, who, afraid that his estimates of the heights of 
mountains might be erroneous, decided that were he to talk about actual 
heights he would be altogether adrift in the unknown; so he said that heights 
were relative to surveyors' instruments, and when he came to a mountain 
hidden in mist he assigned it a non-numerical height because he could not see 
if it were taller or shorter than the others. 

After dealing with the measurement of probabilities, Mr Keynes proceeds to 
consider the Principle of Indifference, which he shows to lead, if stated in its 
usual form, to various contradictions. He proposes to remedy this by stating 
precise conditions for the validity of the Principle. He does not, however, seem 
to have done this successfully. At the bottom of p. 62, he says, 'Suppose that a 
point lies on a line of length ml, we may write the alternative "the interval of 
length l on which the point lies is the xth . . . from left to right" = φ(x); and the 
Principle of Indifference can then be applied safely to the m alternatives φ(1), 
φ(2) . . . φ(m)' and clearly this case does fall under his conditions; and so then 
does the analogous case in which we know that the density of a substance lies 
between 1 and 3; we can then take the 'interval of length 1 in which the 
density lies is the xth from left to right' = φ(x) and apply the Principle to φ(1), 
φ(2), concluding that the density is equally likely to lie in the intervals 1–2 
and 2–3; if now we apply this argument also to the specific volume which we 
know to be between 1 and 1/3, since the density lies between 1 and 3, we find 
that on the same data the specific volume is equally likely to lie in the intervals 
1–2/3, 2/3–1/3 and therefore the density in the intervals 1–3/2, 3/2–3, 
which contradicts the result previously obtained. This contradiction is pointed 
out by Mr Keynes, p. 45, but he seems not to have noticed that it escapes his 
safeguards. 
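
The arithmetic of this contradiction can be checked mechanically. The following is only an illustrative sketch, not part of Ramsey's text: the helper names `halves` and `to_density` are hypothetical, and the code assumes, as the passage does, that specific volume is the reciprocal of density and that each range is split into two intervals of equal length.

```python
from fractions import Fraction

def halves(lo, hi):
    """Split [lo, hi] into two intervals of equal length."""
    mid = (lo + hi) / 2
    return (lo, mid), (mid, hi)

def to_density(volume_interval):
    """Map an interval of specific volume to the matching interval of density."""
    lo, hi = volume_interval
    return (1 / hi, 1 / lo)  # the reciprocal reverses the order of the end points

# Indifference applied directly to the density, known to lie between 1 and 3.
density_halves = halves(Fraction(1), Fraction(3))

# Indifference applied to the specific volume, known to lie between 1/3 and 1.
volume_halves = halves(Fraction(1, 3), Fraction(1))

# The same two alternatives re-expressed as intervals of density.
density_from_volume = sorted(to_density(iv) for iv in volume_halves)

print(density_halves)       # density split at 2
print(density_from_volume)  # density split at 3/2
```

The first application makes the density as likely to be below 2 as above it; the second makes it as likely to be below 3/2 as above it. Both cannot hold on the same data.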

The true solution of the difficulty seems to depend on Mr Johnson's notion 
'The Determinable'. The Principle of Indifference may be stated as follows: 
Relative to evidence, on which it is certain, that a given subject has one or 
other of a finite number of absolute determinates under the same determinable, 
the probabilities that the subject has each of those absolute determinates 
are equal, provided that the evidence is symmetrical with regard to the various 
alternatives. 

The Principle, so qualified, can be applied to dice, coins and cards, but not to 
such cases as the position of a point on a line, in which the number of possible 
absolute determinates (e.g., points on the line) is infinite. It appears that no 
principle can be given for cases of this second kind which would not lead to a 
contradiction like that of the volume and the density. The natural conclusion is 
that in such cases there is no probability; i.e., that there is no logical relation 
between premiss and conclusion. 

In Part II, Mr Keynes gives a symbolic deduction of the formulae of the 
calculus of probabilities from definitions and axioms; this has a minor flaw. Mr 
Keynes conceals two important axioms in definitions; defining the sum of ab/h, 
a~b/h as a/h and the product of a/bh, b/h as ab/h, he conceals the assumptions 
that the sum and product so defined are always unique, i.e., that if ab/h = cd/k 
(= P), a~b/h = c~d/k (= Q) then a/h = c/k (= P + Q); and that if a/bh = c/dk 
(= P), b/h = d/k (= Q) then ab/h = cd/k (= PQ). 

Mr Keynes’ treatment of induction seems to be vitiated by the fact that he 
only considers the Method of Agreement, completely neglecting Mill's other 
four methods including, for example, the Method of Difference, which consists 
in inferring g(φ, f), not from numerous cases, otherwise as varied as possible, 
agreeing in having φf, but from sets of two cases, in other respects analogous, 
one having φf, the other not φ, not f. 




Mr Keynes concludes that induction is only rational if there is a finite a priori 
probability in favour of what he calls the Hypothesis of Limited Independent 
Variety; i.e., that all properties arise out of a finite number of generator 
properties. If this is to be taken literally, i.e., ‘property’ interpreted in the wide 
sense = propositional function of one variable, it is clearly equivalent to the 
hypothesis that the classes of things of the type considered are finite in number, 
since equivalent properties define the same class and on the hypothesis any 
property is equivalent to one of a finite number of properties (i.e., the generator 
properties and negations conjunctions and alternations of them). And this 
hypothesis that the classes of things are finite in number, is in turn equivalent 
to the hypothesis that the things are finite in number, since, if n be the number 
of things, 2^n is the number of classes of things; so that the Hypothesis of Limited 
Variety is simply equivalent to the contradictory of the Axiom of Infinity. 
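
The counting step in this argument, that n things yield 2^n classes, can be illustrated by brute enumeration. This is a small sketch only; the function name `classes_of` is hypothetical and the range of n checked is arbitrary.

```python
from itertools import combinations

def classes_of(things):
    """Return every class (subset) that can be formed from the given things."""
    things = list(things)
    return [frozenset(c)
            for size in range(len(things) + 1)
            for c in combinations(things, size)]

# Number of classes for n = 0, 1, ..., 5 things.
counts = {n: len(classes_of(range(n))) for n in range(6)}
print(counts)
```

Each count equals 2^n, so the number of classes is finite exactly when the number of things is, which is the equivalence the argument relies on.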

Lastly we may note that Mr Keynes’ definition of ‘random’ suggests that he 
may be wrong in his fundamental conception of probability. For in it occurs the 
probability φ(x)/S(x)·h; and it is considered whether this is equal to 
φ(x)/S(x)·h·x=a = φ(a)/S(a)·h. 

Now in φ(x)/S(x)·h, x is a variable. φ(x), S(x) are not propositions at all but 
propositional functions. We have therefore a new kind of probability, a relation 
between two propositional functions, φ(x), S(x) and a proposition h; a kind 
which cannot possibly be reduced to the ordinary kind (a relation between two 
propositions). But the converse reduction (except on Mr Wittgenstein's view of 
identity) is always possible, e.g., φ(a)/S(a)·h = φ(x)/S(x)·x=a·h. We have, 
therefore, two possibilities; either there are two kinds of probability relations, 
two termed relations between propositions, and three termed relations 
between two propositional functions and a proposition; or all probability 
relations are of the latter more complicated kind. 


F. P. RAMSEY 


Brit. J. Phil. Sci. 40 (1989), 223-227 Printed in Great Britain 


DISCUSSION 


Not Very Likely: A Reply to Ramsey 


Keynes is concerned with the relation which exists between one set of 
propositions (our knowledge) and another set (our hypotheses). (See e.g. 1.2, 
1.7, 1.9).¹ This epistemic probability concept clearly differs from the concept of 
physical probability. A coin’s bias for heads exists independently of anyone’s 
knowledge, and it is plausible to suppose that such bias can always be 
measured numerically. If, however, our concern is not physical probability, 
but rather, the probability of heads given our present knowledge, then the 
possibility of numerical measurement is less obvious. But it is obvious that 
epistemic probabilities can sometimes be compared. Let p and q be elements of 
our knowledge corpus K: then p and q have equal and maximal K-probability 
(probability relative to K). Similarly, if K includes both ~p and ~q, then p and q 
have equal and minimal K-probability. Furthermore, equiprobability does not 
have to occur at the epistemic extremes. If my knowledge corpus K includes the 
proposition that there exist two fair coins c1 and c2, then the hypothesis that 
c1 will land heads and the hypothesis that c2 will land heads are 
K-equiprobable. I could not reasonably accept a betting rate for the first 
proposition which I was not prepared to accept for the second. 

These two hypotheses have a K-probability which is equal and intermediate; 
i.e. less than the maximum and greater than the minimum. Clearly then, we 
can apply terms such as ‘equal’, ‘greater than’, ‘less than’ to epistemic 
probabilities, without presupposing that such probabilities are numerical. For 
brevity I shall use the symbols '*=', '*>', '*<' when comparing epistemic 
probabilities. (The asterisk serves as a reminder that we have not yet 
established whether epistemic probabilities are numerical.) Thus for example 
'P(p/K) *< P(q/K)' means 'p has a smaller K-probability than q has'. 

Ramsey criticizes Keynes’ contention that epistemic probabilities are not 
always numerical, comparing him to 


a surveyor who, afraid that his estimates of the heights of mountains might be 
erroneous ... said that heights were relative to surveyors’ instruments, and 
when he came to a mountain hidden in mist he assigned it a non-numerical 
height because he could not see if it were taller or shorter than the others (7). 


1 Section references are to Keynes’ Treatise. Later, paragraph numbers will refer to Ramsey's 
review as reprinted above. 


224 D. E. Watt 


This is excellent rhetoric; it is only the philosophy which is poor. In the first 
place, Keynes does not use the possibility of erroneous probability assignments 
to prove either that probability is relative to evidence, or that probability is 
sometimes non-numerical. The passage (in 3.12) quoted by Ramsey in 
paragraph 5 makes no reference whatever to the possibility of having mistaken 
beliefs about probability. Rather, it makes probability relative to our (limited) 
knowledge, and knowledge obviously does not include our mistaken beliefs. In 
the second place, as regards the numerical nature of epistemic² probability, 
Ramsey's analogy begs the question. It is obvious that mountains have 
numerical heights; it is not at all obvious that probability is numerical. 

In chapter 3, Keynes cites various cases in which, he claims, probability is 
not numerical. Ramsey prefers to analyse these as cases in which no 
probability relation exists (2.3). Given a particular knowledge set K and 
proposition p, Ramsey seems justifiably more cautious than Keynes as regards 
accepting the existence of any probability relation, numerical or otherwise. For 
instance, let K be the knowledge set of a newly born baby, and let p be 
Goldbach's conjecture. Is it clear that P(p/K) exists? If it does, this presumably 
has implications for the way a reasonable agent will regard the proposition. 
But if K is the knowledge of a newly born baby, how should the agent regard p? 
The answer, of course, is that the agent does not regard p at all, so the question 
of how he should do so does not arise. In such a case it seems idle to speak of 
probability. We shall say that p is ‘not a hypothesis relative to K'; ‘not a 
K-hypothesis'. Thus in what follows the term 'hypothesis' will be reserved for 
those cases where some probability exists: maximal, minimal, or intermediate. 

In particular cases, then, Ramsey's caution in regard to the existence of a 
probability relation seems justified. Does this undermine Keynes' arguments 
that probability is not always numerical? In 3.4 he states that a bookmaker 


may be almost certain . . . that there will not be new taxes on more than one of 
the articles tea, sugar, and whisky; there may be an opinion abroad, reasonable 
or unreasonable, that the likelihood is in the order—whisky, tea, sugar; and he 
may, therefore, be able to effect insurances for equal amounts in each at 30 per 
cent, 40 per cent, and 45 per cent. He has thus made sure of a profit of 15 per 
cent, however absurd and arbitrary his quotations may be. It is not necessary for 
the success of underwriting on these lines that the probabilities of these new 
taxes are really measurable by the figures 3/10, 4/10, and 45/100; it is sufficient 
that there should be merchants willing to insure at these rates. These merchants, 
moreover, may be wise to insure even if the quotations are partly arbitrary; for 
they may run the risk of insolvency unless their possible loss is thus limited. That 
the transaction is in principle one of bookmaking is shown by the fact that, if 
there is a specially large demand for insurance against one of the possibilities, the 
rate rises; the probability has not changed, but the 'book' is in danger of being 
upset. 


2 This word will be omitted in what follows, since epistemic probability is our present concern. 




Keynes would claim, then, that if K is the bookmaker's knowledge, and p is the 
proposition that there will be a new tax on whisky, there is no reason for 
supposing that P(p/K) is measurable by the fraction 3/10 or, indeed, that it is 
numerically measurable at all. Can we answer Keynes by denying that P(p/K) 
exists? Hardly. For if P(p/K) did not exist there would be no constraints on the 
way in which an agent, with just K as his knowledge, could reasonably regard 
p. Clearly however, such constraints exist. For instance, the bookmaker cannot 
reasonably be sure of p, or sure of ~p; thus P(p/K) is less than the maximal 
probability and greater than the minimum probability. However, a numerical 
value for P(p/K) seems highly dubious. 
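
The bookmaker's arithmetic in the passage from 3.4 can be set out explicitly. The sketch below is illustrative only: the variable names are hypothetical, the sum insured is fixed at 100 for convenience, and it assumes what the bookmaker is "almost certain" of, namely that at most one of the three articles is taxed.

```python
# Premiums quoted by the bookmaker, in per cent of the sum insured.
premium = {"whisky": 30, "tea": 40, "sugar": 45}
insured = 100  # equal amounts insured against each tax (assumed unit)

collected = sum(premium.values())  # total premium income

# Assumption: at most one article is taxed, so the outcomes are
# "no new tax" (None) or a new tax on exactly one article.
outcomes = [None] + list(premium)
profit = {o: collected - (insured if o else 0) for o in outcomes}
worst_case = min(profit.values())

print(collected, profit, worst_case)
```

On these figures the premiums come to 115 and the worst case leaves a profit of 15, whichever article is taxed. Nothing in the calculation requires that 3/10, 4/10 and 45/100 measure the probabilities; indeed, as rates for mutually exclusive events, they sum to more than 1.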

It might be suggested that P(p/K) can be evaluated numerically by one of the 
available techniques of forcing the reasonable agent to bet. However, this 
suggestion, far from proving that P(p/K) is numerical, seems to presuppose this. 
Only if we can be sure in advance that P(p/K) is numerical, can we be sure that 
there will be a single betting rate, appropriate to that value, which the 
reasonable agent must choose. If, however, there is no good reason for 
supposing that P(p/K) is numerical, there is no good reason for supposing that 
the reasonable agent is restricted to one betting rate. Perhaps there is a whole 
range of betting rates which he can choose. 

In contrast to Ramsey, I suggest that Keynes' examples are evidence for the 
conclusion that probability is not always numerical, and indeed for the 
stronger conclusion (also accepted by Keynes: see 3.8) that probabilities are 
sometimes not even comparable; that we can have knowledge K, and 
K-hypotheses H1 and H2, such that P(H1/K) is neither greater than, nor equal 
to, nor less than P(H2/K). I shall now offer a proof of this stronger conclusion 
by adapting an example from 4.6, intended by Keynes for a different purpose. 

Define the specific volume of a substance as the density of water, divided by 
the density of that substance, and define the specific density of a substance as 
the density of that substance, divided by the density of water. Let our 
knowledge be K, and suppose there is a substance S regarding which the 
following four propositions are K-hypotheses: 


H1: The specific volume of S lies between 1 and 2, 
H2: The specific volume of S lies between 2 and 3, 

H3: The specific density of S lies between 1/3 and 2/3, 
H4: The specific density of S lies between 2/3 and 1. 


In addition, suppose we know that either H1 or H2 is true, and also that either 
H3 or H4 is true. However, we have no more exact information as to the 
whereabouts of the specific volume or the specific density. Thus our knowledge 
does not favour H1 over H2 or vice-versa; similarly, it does not favour H3 over 
H4 or vice-versa. Symbolically, 


(a) ¬[P(H1/K) > P(H2/K)] & ¬[P(H1/K) < P(H2/K)]. 
(b) ¬[P(H3/K) > P(H4/K)] & ¬[P(H3/K) < P(H4/K)]. 


226 D. E. Watt 


In addition, suppose that because K contains the following propositions: 


H1 is logically weaker than H4, 
H2 is logically stronger than H3, 


we have: 


(1) P(H1/K) > P(H4/K), 
(2) P(H2/K) < P(H3/K). 


We are arguing against the following doctrine: 
C: All probabilities are comparable, 


and the proof proceeds by reductio. Assume, therefore, that C is true. Then 
given (a), we have: 


(3) P(H1/K) = P(H2/K); 

(1) and (3) entail that 

(4) P(H2/K) > P(H4/K); 

(4) and (2) entail that 

(5) P(H3/K) > P(H4/K); 


and (5) contradicts the first conjunct of (b). 

Thus the doctrine that all probabilities are comparable has been refuted by 
describing a knowledge set K, and K-hypotheses H1 and H2, such that 
although H1 is neither more nor less K-probable than H2, a contradiction 
arises from the assumption that H1 and H2 are K-equiprobable. So P(H1/K) 
and P(H2/K) are incomparable. A similar argument shows that P(H3/K) and 
P(H4/K) too are incomparable. 
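
The logical relations on which premises (1) and (2) rest can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes that each hypothesis may be represented as a closed interval of specific density, and the helper names `density_interval` and `entails` are introduced here purely for illustration.

```python
from fractions import Fraction as F

def density_interval(volume_interval):
    """Map a specific-volume interval (lo, hi) to the matching specific-density interval."""
    lo, hi = volume_interval
    # Specific density is the reciprocal of specific volume, so the bounds swap.
    return (1 / hi, 1 / lo)

def entails(inner, outer):
    """True when every value in `inner` also lies in `outer` (closed intervals assumed)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

# H1 and H2 are stated in terms of specific volume; restate them as densities.
H1 = density_interval((F(1), F(2)))
H2 = density_interval((F(2), F(3)))
# H3 and H4 are already stated in terms of specific density.
H3 = (F(1, 3), F(2, 3))
H4 = (F(2, 3), F(1))

h4_entails_h1 = entails(H4, H1)   # H1 is logically weaker than H4
h2_entails_h3 = entails(H2, H3)   # H2 is logically stronger than H3
print(H1, H2, h4_entails_h1, h2_entails_h3)
```

On this representation H4 falls inside H1 and H2 falls inside H3, while neither converse inclusion holds, which is all that (1) and (2) require.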

It might fairly be asked: How does reasonable action as regards incompar- 
able hypotheses differ from reasonable action in regard to equiprobable 
hypotheses? I offer the following suggestion. Suppose that an agent with 
knowledge set K is betting on K-equiprobable hypotheses h and h'. Then he 
need not be limited to a single betting rate—there may be a whole range of 
rates he can reasonably accept. Nonetheless, whatever rate he accepts on one 
hypothesis he should be prepared to accept on the other.³ But if h and h’ are 
incomparable, there are fewer constraints on our agent. He can now 
reasonably accept a betting rate for h which he refuses to accept for h', provided 
that he does not do so from the belief that one hypothesis is more probable than 
the other. 


3 Here I am, of course, abstracting from the content of h and h’: obviously the case may be altered if 
h is ‘Betting is not worthwhile’. There are various techniques for dealing with this problem; e.g. 
we can get the agent to bet on the truth of h and h’ without telling him what they are, but only 
that they are equiprobable relative to his knowledge. 




To illustrate this suggestion, let us revert to our previous example, but with 
slightly different K-hypotheses: 


H1: The specific volume of S lies between 1 and 2, 

H2: The specific volume of S lies between 2 and 3, 
H1’: The specific density of S lies between 1/2 and 1, 
H2': The specific density of S lies between 1/3 and 1/2. 


Suppose the agent sees that H1 and H1' are logically equivalent, as are H2 and 
H2'. He also sees that the interval between 1/2 and 1 is three times as large as 
the interval between 1/3 and 1/2. He might then capriciously choose a betting 
quotient for H1 which is three times that of H2. He would be perfectly 
reasonable in so doing, provided he was prepared to acknowledge that he 
could just as reasonably have favoured H2 instead. In fact, once he sees that 
the interval between 2 and 3 can be isomorphically mapped into an arbitrarily 
small interval, his betting can favour H1 to an arbitrary extent, provided he 
never treats it as a certainty. 
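
The arithmetic behind the agent's capricious choice can be made explicit. The sketch below is illustrative only: it assumes that the "size" of a hypothesis is measured by the width of its interval, and the names `width`, `volume` and `density` are introduced here for the purpose.

```python
from fractions import Fraction as F

def width(interval):
    """Length of an interval given as (lo, hi)."""
    lo, hi = interval
    return hi - lo

# The two hypotheses stated in terms of specific volume.
volume = {"H1": (F(1), F(2)), "H2": (F(2), F(3))}
# The logically equivalent hypotheses H1' and H2' in terms of specific density.
density = {name: (1 / hi, 1 / lo) for name, (lo, hi) in volume.items()}

volume_ratio = width(volume["H1"]) / width(volume["H2"])
density_ratio = width(density["H1"]) / width(density["H2"])
print(volume_ratio, density_ratio)
```

Measured by specific volume the two hypotheses occupy intervals of equal width; measured by specific density the first occupies an interval three times as wide as the second. Nothing in K selects one scale over the other, which is why neither ratio binds the agent.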


D. E. WATT 
Campion College 
Melbourne 


Brit. J. Phil. Sci. 40 (1989), 229-231 Printed in Great Britain 


DISCUSSION 


Can a Theory-Laden Observation 
Test the Theory? 


One of the interesting questions in exploring the complex interaction between 
experiment and theory is that of the theory-ladenness of observation. In its 
most radical form, incommensurability, Kuhn and Feyerabend have argued 
that experiment cannot distinguish between competing paradigms or theories. 
Briefly stated, the argument is as follows. There can be no neutral observation 
language. All observation terms are theory laden, and thus we cannot compare 
experimental results because in different paradigms the terms describing those 
experimental results have different meanings, even when the words used are 
the same. An example would be the term 'mass' which in Newtonian 
mechanics is a constant, while in Einstein's special theory of relativity it 
depends on the velocity of the object. It has already been argued that in this 
particular instance, the change from Newtonian to Einsteinian mechanics, a 
prime example for both Kuhn and Feyerabend, a procedurally defined, 
theory-neutral (between the two competing theories) experiment can 
distinguish between the theories (Franklin [1984]). 

There are even circumstances under which the theory-ladenness of an 
experimental result can be a virtue. In their argument that different 
experiments provide more confirmation of an hypothesis than repetitions of 
the same experiment, Franklin and Howson [1984] pointed out that the 
existing theoretical context may provide reasons why experiments will be 
considered different. Thus, tests of the velocity addition law at speeds close to or 
small compared to the speed of light would be considered almost the same 
before 1905, when Newtonian mechanics, which made no such distinction, 
was the only theory. After 1905, when Einsteinian relativity became a serious 
competitor, such experiments would have been considered quite different. At 
the very least, observation or experiment will be laden with the theory of the 
apparatus. Dudley Shapere has extended the idea of 'direct observation' to 
include theoretical beliefs. In his discussion of the solar neutrino experiment he 
states, 'X is directly observed if (1) information is received by an appropriate 
detector and (2) that information is transmitted directly, i.e. without 
interference, to the receptor from the entity X (which is the source of the 
information)’ (Shapere [1982], p. 492). The dependence on theory is clear. 


230 A. Franklin et al. 


Theory tells us what an appropriate receptor is and that the information is 
transmitted without interference. 

Nevertheless, if the theory of the experimental apparatus and the theory of 
the phenomena under test are distinct no obvious problems arise for the testing 
of the theory of the phenomena. Thus, in testing whether or not parity was 
conserved in the weak interactions the experimenters could, and did, use an 
apparatus whose proper operation depended only on the strong and electro- 
magnetic interactions, in which parity conservation was already well 
established. 

A more difficult problem arises when the apparatus, or part of the apparatus, 
depends for its proper operation on the theory of the phenomena under test. 
There would seem to be, at first glance, a vicious circularity if one were to use a 
mercury thermometer to measure the temperature of objects as part of an 
experiment to test whether or not objects expand as their temperature 
increases (for an interesting and relevant discussion see Gillies [1972]). Such 
a thermometer depends on that same hypothesis for its proper operation. We 
believe, however, that one could use a mercury thermometer in such a test. All 
that would be required would be the ability to calibrate such a thermometer 
against another thermometer whose operation depends on a different theory. 
For example, we could calibrate the mercury thermometer against a constant 
volume gas thermometer, whose proper operation depends on the proportion- 
ality of the pressure of a gas to its absolute temperature. In such a case one can 
disregard the theory of the mercury thermometer and use it merely as a 
calibrated temperature measuring device. Calibration, in this instance, serves 
to transfer the theory of the apparatus from one theory to another, thus 
separating the theory of the apparatus from the theory under test. It would also 
be quite odd if an experimenter who did not know the theory of the mercury 
thermometer could use it in a test of the hypothesis, while one who did know 
the theory was barred from doing so. 

In fact, one could even use a thermometer which, at first glance, seems to 
refute the hypothesis and still test that hypothesis. A mercury thermometer 
actually depends on the differential thermal expansion of the mercury and the 
glass container. Suppose one were to use a container that had a higher 
coefficient of thermal expansion than the mercury. When such a thermometer 
is heated the column of mercury goes down, refuting, at least for a very naive 
observer, the hypothesis. Despite this, one can use the thermometer, if properly 
calibrated, to measure the temperature of other objects by means of the fall of 


1 The careful reader may note that the calibration of such a mercury thermometer is itself a test of 
the hypothesis, but that is irrelevant. The experimenter need not know of the hypothesis, or care 
about it, when the calibration is performed. It would also be quite odd if the intended use of an 
instrument had a serious effect on its calibration and on the belief in its reliability. This is not, of 
course, to deny that the precision and accuracy of an instrument is important for testing a 
particular hypothesis. If the instrument is not precise enough, or accurate enough, the test may 
be inconclusive, but that does not affect the calibration of or the reliability of the instrument. 




the column, and thus test the hypothesis. Further reflection would, of course, 
demonstrate that the thermometer does not, in fact, refute the hypothesis. 

We do not wish to imply that there are no possible cases in which the theory- 
ladenness of observation prevents the testing of a theory,² but we believe that 
those who believe such tests are impossible should present a workable example 
from actual science. At the moment we know of no such cases. 


A. FRANKLIN, M. ANDERSON, D. BROCK, S. COLEMAN, 
J. DOWNING, A. GRUVANDER, J. LILLY, J. NEAL, 

D. PETERSON, M. PRICE, R. RICE, L. SMITH, 

S. SPEIRER and D. TOERING 

University of Colorado 


2 We note that our discussion does not, of course, solve the Duhem-Quine problem. A determined 
sceptic could always maintain that the experimental result is wrong and hence does not test the 
theory. 


REFERENCES 


FRANKLIN, A. [1984]: ‘Are Paradigms Incommensurable?’ British Journal for the 
Philosophy of Science, 35, pp. 57-60. 

FRANKLIN, A. and HOWSON, C. [1984]: ‘Why Do Scientists Prefer to Vary Their 
Experiments?’ Studies in History and Philosophy of Science, 15, pp. 51-62. 

GILLIES, D. [1972]: ‘Operationalism’, Synthese, 25, pp. 1-24. 

SHAPERE, D. [1982]: ‘The Concept of Observation in Science and Philosophy’, 
Philosophy of Science, 49, pp. 485-525. 


Brit. J. Phil. Sci. 40 (1989), 233-247 Printed in Great Britain 


DISCUSSION 


Quine on Theory and Language* 


1 Social Character of Language 

2 Sameness of Language 

3 Analytic Sentences and Dynamics of Language 
4 Observation Sentences 

5 Learnability of Language 


1 SOCIAL CHARACTER OF LANGUAGE 


Quine writes ‘Language is a social art’ at the beginning of the Preface to Word 
and Object. The social character of language has been emphasized at various 
places in his writings since Word and Object. As Dummett points out (Dummett 
[1978], pp. 382 f.), this aspect of the book (and other later writings) differs 
remarkably from ‘Two Dogmas of Empiricism’. 

In ‘Two Dogmas’, the emphasis was on the general revisability of our 
theories. When we encounter a recalcitrant experience, according to the 
Quine in ‘Two Dogmas’, we can accommodate it ‘by any of various alternative 
reëvaluations in various alternative quarters of the total system’ (Quine 
[1951], p. 44). Moreover, in ‘Two Dogmas’, it was supposed that each 
individual could choose in his own way how to revise the system. However, 
once the social character of language is recognized and ‘the fixity needed in 
communication’ (Quine [1960], p. 56) is acknowledged, one can no longer 
maintain any of these theses. Quine says that the criterion of membership in 
the same speech community (or of sameness of language) is ‘general fluency of 
dialogue’ (Quine [1969], p. 87 and n. 7 on p. 88. See also Quine [1974], pp. 39 
& 84). A drastic and wholesale revision admitted in ‘Two Dogmas’ will destroy 
the ‘fluency of dialogue’, and so cannot be taken as a revision of theories within 
the same language. And if each of us revises the system in his own particular 
way quite different from each other, all dialogue will become impossible. The 
social character of language requires some constraints on general revisability. 
It is true that, already in ‘Two Dogmas’, Quine was considering a constraint on 
the revisability, that is, ‘our natural tendency to disturb the total system as 


* This is an abridged version of the paper originally written under the same title for an internal 
publication of Tokyo Metropolitan University. 


234 Nobuharu Tanji 


little as possible’ (Quine [1951], p. 44), in other words, our ‘conservatism’ 
(ibid., p. 46). But it is obvious that, in ‘Two Dogmas’, Quine regarded this 
‘natural tendency’ as a useful but merely accidental matter which has little to 
do with the basic conditions of the possibility of language, and which is, 
therefore, semantically irrelevant. 

On the Word and Object view, however, our natural tendency toward 
conservatism plays an important role for the possibility of language. In terms of 
‘speech dispositions’, to learn a language is to acquire the dispositions to verbal 
behaviour common to the members of the linguistic community. And an 
arbitrary and large scale revision will make one’s speech dispositions so 
different from those of others that the general fluency of dialogue will be 
destroyed, and that one will lose one’s membership in the community. 
Preserving the body of common speech dispositions is a semantic requirement, 
i.e., what is required for the possibility of language. An effect of recognizing the 
social character of language is recognizing ‘conservatism’ as a semantically 
relevant constraint on the general revisability. 

This constraint on revisability, in turn, has the consequence that we cannot 
separate the understanding of language from the acceptance of those beliefs 
which are shared by almost all members of the linguistic community, because 
acquiring the verbal dispositions common to almost all members of the 
community includes acquiring the dispositions to assent unconditionally to 
the sentences which constitute the shared beliefs. Our present body of beliefs 
has a privileged status in relation to our present language. We can talk fluently 
about almost nothing with those who do not have ‘the immemorial doctrine of 
ordinary enduring middle-sized physical objects’ (Quine [1960], p. 11). If we 
want to talk fluently about the sub-atomic structure of the world, or Greek 
literature, we must, first of all, learn advanced theories of physics, or acquire 
sufficient knowledge of Greek literature. In this sense, we cannot separate 
clearly between the understanding of language and substantial belief or 
knowledge as far as the latter is generally shared. Of those sentences which 
constitute the socially shared beliefs and knowledge, we can say with 
Wittgenstein that 'the truth of my statements is the test of my understanding of 
these statements,’ and that ‘if I make certain false statements, it becomes 
uncertain whether I understand them’ (Wittgenstein, 80-1). 

Then, however, there may occur a suspicion such as the following. We can 
change, and actually have changed, the system of beliefs generally held by our 
linguistic community. And even a very trifling change in our common beliefs 
must be counted as a change in the set of speech dispositions common to 
almost all members of the linguistic community. Therefore, whenever we 
change even a very small part of our common beliefs, must we not admit that a 
linguistic change happens? (Cf. Dummett [1978], p. 412) This consequence 
would be not only implausible but also fatal to rational considerations and 
arguments. For it seems to me that any consideration or argument must be 




done within the same language if it is to be called a consideration or argument at 
all, and changes in our common beliefs typically take place as results of 
considerations and arguments. To reply to this objection, we must consider the 
criterion of the sameness of language. 


2 SAMENESS OF LANGUAGE 


The objection mentioned above is based upon a very natural and literal 
reading of Word and Object. For Quine writes frequently as if he were completely 
identifying a language with a set of speech dispositions. He speaks of ‘language 
as the complex of present dispositions to verbal behavior’ (Quine [1960], p. 
27), ‘language as a socially inculcated set of dispositions’ (ibid., p. 45), etc. I 
think, however, that Quine can reply to this objection by saying that, although 
a language is seen as consisting of a set of dispositions, the criterion of the 
sameness of language is not the sameness of the set of dispositions but ‘general 
fluency of dialogue’ as declared in ‘Epistemology Naturalized’ (Quine [1969], 
pp. 87 f.) and in The Roots of Reference. (An animal consists of a set of atoms, but 
the criterion of the sameness of an animal is not the sameness of the set of 
atoms.) And so, the reply continues, we need not regard the abandonment of 
one stimulus-analytic sentence as a change in the language insofar as general 
fluency of dialogue is preserved. We can say that small changes in the totality 
of speech dispositions common to the linguistic community may be thought of 
as conforming to other large parts of it, and that a discourse that is only slightly 
deviant from the common dispositions can be seen as within the same 
language as far as ‘general fluency’ is not violated. 

It may, however, be objected that what this move can save are only slight 
changes in our beliefs or theories. Can the great changes in our beliefs which 
have happened through the long history of our community or which were 
brought about by the scientific revolutions be dealt with by this manoeuvre? 
Because these great changes involve changes in too many dispositions to 
preserve general fluency of dialogue, must we not admit some changes in 
language to which we cannot apply this manoeuvre? Because the thesis that 
understanding a language and accepting the common beliefs are inseparable 
(to this, now, I should add this qualification: ‘beyond certain permissible 
fluctuations’) seems to imply that changes in the common beliefs (beyond 
certain permissible fluctuations) are not separable from changes in language, 
this objection seems to be natural. 

I think, however, we can reply to this objection, and, in the course of the 
reply, an essential feature of language will emerge. The first step of my reply is 
to take seriously Quine's criterion of the sameness of language, i.e., the general 
fluency of dialogue. This is undoubtedly a very vague criterion, and Quine 
himself does not make anything of it. But I think that this criterion captures a 
very important feature of language, i.e., that the sameness of language does not 




satisfy the transitivity law. If there are only slight differences between the 
speech dispositions common to almost all speakers of L1 and those of L2, the 
speakers of these languages will be able to communicate so fluently with each 
other that L1 and L2 can be taken as the same language, according to this 
criterion. But, of course, cumulative slight differences can make a vast 
difference. Even if general fluency holds between L1 and L2, and between L2 and 
L3, and so on, it nevertheless need not hold, in general, between L1 and Ln. 
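
The failure of transitivity can be illustrated with a deliberately crude toy model. In the sketch below each stage of a language is represented by a single number and 'general fluency' by a fixed tolerance on the difference between two stages; both are assumptions made here for illustration only, and nothing in Quine's criterion licenses such a measure.

```python
def quasi_same(a, b, tolerance=1):
    """Toy stand-in for 'general fluency of dialogue' between two stages.

    Stages are plain numbers and fluency is a tolerance on their difference;
    this is a hypothetical simplification, not a measure of real languages.
    """
    return abs(a - b) <= tolerance

# A chain of stages L1 ... Ln, each differing only slightly from its neighbour.
stages = [0, 1, 2, 3, 4, 5]

# Every neighbouring pair in the chain counts as quasi-same.
neighbours_ok = all(quasi_same(a, b) for a, b in zip(stages, stages[1:]))

# The two terminal stages, compared directly, do not.
endpoints_ok = quasi_same(stages[0], stages[-1])
print(neighbours_ok, endpoints_ok)
```

Any relation defined by a tolerance behaves in this way: it is reflexive and symmetric, but small permitted differences accumulate along a chain, so it is not transitive.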

It is right to say that, if the transitivity law does not hold, what you have is, 
strictly speaking, not a notion of sameness or identity. But, I think, we must 
accept the non-transitivity of, as it were, the quasi-sameness of language as 
one of the essential characteristics of language. In other words, my suggestion 
is that we ought to replace the strict notion of diachronic identity of language, 
which does satisfy the transitivity law, by that of a chain of stages of socially 
shared linguistic dispositions, connected only by the relation of 'quasi- 
sameness'.¹ 

The most important consequence of the non-transitivity of the quasi- 
sameness of language is that the very notion of a ‘change of language’ becomes 
not as straightforward as it might be if it were transitive. We must distinguish 
two points of view from which we speak of changes of language and theories. 
Suppose that there is actually a chain of stable stages of a language, between 
each neighbouring two of which we have no obstacle in communication, and 
that that chain connects two considerably different terminal stages. One point 
of view that I shall call the ‘internal point of view’ is that from which we trace 
out the chain stage by stage. From this point of view, we can speak of 
successive changes of beliefs (or theories) within the same language, because, 
tracing out the chain stage by stage, we do not find any discontinuity, any 
breakdown of fluency of dialogue. Of every step, we can say that the change of 
belief happens within the (quasi-)same language. The other point of view, the 
‘external point of view’, is that from which we compare the two terminal stages 
directly. From this point of view, we should say that a change of language has 
occurred, because there is a large discrepancy between the common 
dispositions at the initial stage and at the final stage. 

What is important about the distinction between the two points of view is 
that, even where we should say that there has been a considerable change in 
the language from the external point of view, we need not assume, from the 
internal point of view, that there must be a certain point at which a change of 
language really happened, and therefore to which we must abandon any hope 
of applying a semantic constraint. I think that it is an unjustifiable 
presupposition to assume that a change of language must happen at a certain 


1 The remark that ‘any consideration or argument must be done within the same language’ in the 
previous section should be understood as '.. . within the quasi-same language’. Similarly, a 
‘change’ of language between two stages should be construed as not holding of the quasi- 
sameness relation between them. 




definite point. And another presupposition, namely, that the (so-called) 
sameness of language does satisfy the transitivity law undoubtedly underlies it. 
It seems to me that these presuppositions are responsible for many difficulties 
concerning the ‘paradox of meaning variance’, or the question, ‘Change of 
theory or change of language?’ Once we have recognized the non-transitivity 
of the quasi-sameness of language and the resulting two points of view, we can 
instead say: ‘Change of theory and change of language’. Or, to state this 
somewhat paradoxically, theories change at any time within the quasi-same 
language, and during the same period the language also changes. 

Now, the answer to the objection above, at least as far as great but long term 
changes in our beliefs are concerned, should be clear. I admit that great 
changes in our language have really taken place as results of the changes in 
our beliefs, seen from the external point of view. But this fact does not imply 
that at least some changes in our beliefs must be regarded as abrupt changes in 
our language so that the arguments and considerations which led to these 
changes in our beliefs cannot be taken as performed within the (quasi-)same 
language. 

But the case of large changes in our beliefs during relatively short periods 
such as the case of scientific revolutions is much harder to deal with in my 
scheme, because my scheme needs a sequence of stable stages. To mitigate 
this difficulty, we can point out, following Harold Brown, that a scientific 
change starts within a definite problem situation, and that a scientific 
revolution is not a one-step process. There are many stages even during the 
period of a scientific revolution (Brown, pp. 139 & 144). I do not think, 
however, that we can absolutely deny that a scientist’s considerations and 
arguments can involve, or have actually involved, the abandonment of a not 
negligible portion of common beliefs at a single stage because of the seriousness 
of the difficulties in the previous theory. So, I have to water down the semantic 
constraint a little bit more. A situation may occur in which there is no way but 
to deny one or another central sentence of the theory, i.e., a sentence the denial 
of which demands the denial of many other sentences in the theory. In a case of 
this kind, what the scientist has to do in order to preserve the communicability 
with his colleagues is to give a detailed explanation of, or precise arguments 
for, the abandonment of common beliefs and his new proposal.² To do this 
means to make his innovation connected to the remaining parts of common 
beliefs and other common linguistic dispositions, which are the only 
background shared by himself and the other members of the community and, 
therefore, the only means he can make available to connect himself to the 
community and its tradition. Arguments at full length in conformity to the 
remaining parts of common dispositions may be able to restore the fluency of 


2 Here, I have in mind a scientific community, as a sub-community of the whole linguistic 
community. We must here take into account Putnam's idea of the ‘division of linguistic labour’ 
(Putnam, p. 227). 




dialogue which is threatened by the loss of common beliefs. Conversely, if 
someone denies our common beliefs without being able to give a detailed 
explanation, he cannot be regarded as speaking the same language as ours. It 
should be, I think, counted not merely as a methodological principle but as a 
semantic requirement that the scientist give fully detailed arguments in such a 
case. If a physicist says that an electron is a wave, and if he cannot give any 
argument for the statement, when all other physicists believe that an electron 
is a particle in the literal sense of the word, then they will not be able to 
understand what he means. 

From the beginning, it was acknowledged that the general fluency of 
dialogue was a very vague criterion. It is obvious that this criterion admits of 
degrees. What the speaker of a language is responsible for should be taken as a 
high level of fluency. In the case of a scientist who is proposing a novel theory, 
the level of fluency is reduced so that he must say many things to recover a 
high level of fluency. The less we are fluent, the more we must say. That the 
criterion of membership in a linguistic community admits of degrees goes well 
with the fact that a child learns his language gradually, step by step, until he 
gets full membership. But the case of a scientist who is proposing a novel 
theory is, of course, quite different from the case of a child who has not 
sufficiently acquired the generally held beliefs and says something contrary to 
what we unanimously agree on. The difference lies in that the former, but not 
the latter, can give arguments for his deviation from the others, in conformity to 
the remaining parts of common dispositions. 

To summarize what I have said so far, the semantic requirement which 
should be satisfied by any talk within the quasi-same language can be roughly 
formulated as follows. 


For someone's talk to be within the quasi-same language as that of other 
members of the linguistic community, it is necessary that either (1) his 
talk is sufficiently in conformity with the speech dispositions shared by 
almost all members of the community, or (2) though he disagrees over a 
non-negligible part of common beliefs, he can give enough arguments, 
which in turn are in accordance with the rest of common speech 
dispositions, to compensate for the lack of fluency of dialogue due to the 
disagreement. 


I shall call this requirement the 'Principle of Compensation' hereafter. 


3 ANALYTIC SENTENCES AND DYNAMICS OF LANGUAGE 


By appealing to the 'Principle of Compensation', introduced in the last section, 
we can give an explanation to Putnam's thesis that the sentence which 
expresses the criterion of a 'one-criterion word' (e.g., 'All bachelors are 
unmarried men')—which I shall call the ‘criterial sentence’ of the one- 




criterion word hereafter—is immune to revision. (Putnam, Ch. 2) As Quine 
says, without the connection with ‘unmarried man’, ‘bachelor’ has ‘no very 
evident social determination, hence no utility in communication’ (Quine 
[1960], p. 56). In other words, the connection with ‘unmarried man’ is the 
only basic common disposition as to ‘bachelor’ which makes any talk 
containing ‘bachelor’ sensible. Then, it is impossible to revise the sentence ‘All 
bachelors are unmarried men’ subject to the Principle of Compensation. For, if 
we are to cut the connection between ‘bachelor’ and ‘unmarried man’, there 
will remain no common speech disposition to be relied on to give an 
explanation of, or an argument for, the revision. About sentences which 
consist of ‘law-cluster words’ (see Putnam), on the contrary, arguments for the 
revision are possible. For, when one of the connections is severed, the other 
connections can serve as resources in giving arguments for it. 

This immunity to revision of the criterial sentence of a one-criterion word, 
however, should be taken as stated from the static point of view, i.e., stated 
about the language at one stage. Such characterizations of words as one- 
criterion words and law-cluster words are characterizations of words at each 
stage. It is conceivable, as Putnam shows, that a one-criterion word becomes a 
law-cluster word, and that the immunity to revision of the criterial sentence is 
lost. It may be discovered and come to be generally believed that all bachelors 
and only bachelors have a special kind of neurosis, a special kind of sexual 
frustration. Let us call it ‘bachelor neurosis’. And, suppose, due to the 
popularization of 'super psycho-analysis', we all have come to have an ability 
to tell quite easily whether someone suffers from bachelor neurosis or not. 
Then, it is not impossible that 'bachelor' will be more strongly connected with 
‘bachelor neurosis’ than with ‘unmarried man’, so that, if it eventually turns 
out that some people suffer from bachelor neurosis without being unmarried 
men, we may deny the sentence ‘All bachelors are unmarried men’! (Putnam, 
p. 58) An actual example of the change from a one-criterion word to a law-cluster
word Putnam gives is that of the word ‘atom’ (ibid., p. 68). ‘Atom’ was
once used as a one-criterion word whose criterial sentence was 'Atoms are 
indivisible'. But, after the long history of the word, it has become a law-cluster 
word, and moreover the sentence which used to be the criterial sentence of the 
word has been denied. 

Up to this point, I agree with Putnam. Indeed, I borrowed several ideas from 
him. But my story about the change from a one-criterion word to a law-cluster 
word is slightly but in an important respect different from Putnam's. The fable 
about bachelors' sexual frustration is actually given by him as a possible 
objection to his view. Putnam thinks that there are analytic sentences in a strict
sense, i.e., in the sense that they are permanently immune to revision,
permanently, that is, so long as we use the same language. ‘All bachelors are 
unmarried men' is, according to him, an analytic sentence in this sense. 
Because ‘to say that an intention is to do something permanently is not the
same as saying that the intention is permanent’ (ibid., p. 59), Putnam thinks,
the possibility of a change of the kind envisaged in the fable of bachelor 
neurosis (i.e., the loss of the immunity to revision) can be accommodated by
saying that we simply ceased to commit ourselves to the previous language. He 
says: 
If ‘bachelor’ ever becomes a ‘law-cluster’ word, then we shall simply have to 
admit that linguistic character of the word has changed. (Ibid., p. 68. My italics.) 


I suppose that Putnam would say the same about the word ‘atom’, because he 
gives this example just after the passage quoted above. If, however, the 
development of atomic theory should be said to be guided by experience and 
theoretical considerations, it does not do full justice to this fact to describe the 
change of the linguistic character of ‘atom’ simply as a linguistic change. If the 
non-transitivity of the quasi-sameness of language is right, we can suppose a 
chain of stable stages which connects the initial stage where ‘atom’ was a one- 
criterion word and the final stage where it is a law-cluster word, so that we 
need not assume a point of linguistic change from the internal point of view. In 
other words, we can say that at every stage the arguments and considerations 
were performed within the quasi-same language. 


4 OBSERVATION SENTENCES 


The picture of language I have depicted so far closely follows Quine’s ideas in 
Word and Object and other later writings. I take it that my picture may be seen 
as a slight articulation or a slight modification of what Quine seems to have 
meant particularly when he emphasized ‘the spirit of science and... the 
evolutionary spirit of ordinary language itself’, and when he likened our 
system of beliefs to Neurath's boat (Quine [1960], p. 4). But, about observation 
sentences, I should be rather critical of him.

Quine seems to have two motives in his formulation of the model of 
language in his later writings. One is to preserve the main structure of the ‘Two
Dogmas’ model, i.e., (1) the periphery of our system of beliefs as what makes
the connection between experience and the system, and (2) revisability inside 
the system. The other motive is to bring in a stability element or ‘the fixity
needed in communication’ over the whole language, urged by the recognition of 
the social character of language. The effect of the second motive on the second
component of the ‘Two Dogmas’ model was a restriction on the revisability to
avoid the disruption of ‘general fluency of dialogue’. What, then, about the first
component, i.e., the periphery of the system? 

In 'Two Dogmas', although Quine stressed the role of the periphery as the 
only link between our system of beliefs and experience, he did not give a full
characterization to the statements just on the periphery. In particular, it is very 
important but extremely unclear whether there are peripheral statements
which are directly, and individually, connected with certain particular ranges
of experience. 

One sentence, which is very often quoted, ‘Any statement can be held true 
come what may, if we make drastic enough adjustments elsewhere in the 
system’ (Quine [1951], p. 43) suggests, or even implies, that there is no such 
privileged statement. If, however, there is no privileged peripheral statement in 
the ‘Two Dogmas’ model, we cannot see why there can be a ‘recalcitrant 
experience’, which demands that we make revisions somewhere in the system. If
no statement corresponds individually to any experience, then the whole 
system cannot have any connection with experience either. An experience 
must be described by a statement in order to be recalcitrant. For it is only when 
that statement cannot be added to the system without making it inconsistent 
that recalcitrancy arises at all. A statement (or a set of statements) can be 
inconsistent only with statements, and not with experience. If experience has 
some effect on the system even in a moderate way as the ‘boundary conditions’ 
(Quine [1951], p. 42), at least along the periphery there must be statements 
which correspond individually to certain ranges of experience. If there were no 
such privileged statements, our ‘conservatism’ and the demand for simplicity 
could be satisfied in the simplest way, namely, by revising nothing come what 
may. 

In Word and Object, Quine remedied the ambiguity about the periphery by 
introducing and clearly characterizing observation sentences. An observation 
sentence is one on which all speakers of the language give the same verdict
when given the same concurrent stimulation (Quine [1969], pp. 86 f.).³ To
observation sentences, Quine gives the special status of ‘the cornerstone of 
semantics’ in view of their two fundamental roles, namely, as ‘the repository of 
evidence for scientific hypotheses’, and as ‘the only entry to a language’ (Quine 
[1969], pp. 88 f.). He thinks that, for observation sentences, ‘the notion of
stimulus meaning constitutes a reasonable notion of meaning’ (Quine [1960], 
p. 44). Then, it seems clear that changes in the stimulus meanings of 
observation sentences will be changes in the meanings of those sentences, and 
therefore, changes in language. In other words, the stimulus meanings of 
observation sentences cannot change as long as we use the same language. 
Observation sentences are, according to Quine, ‘sentences that members of the 
community are not apt to be led to disagree over by differences in their previous 
experience or in their theoretical speculations’ (Quine [1970a], p. 4). With 
respect to 'radical translation', the sameness (or significant approximation) of 
the stimulus meanings of observation sentences between the jungle language
and the linguist's home language is thought of as the starting point and the
most objective foundations of radical translation (Quine [1970b], p. 179. See
also Quine [1969], p. 89).

3 Strictly speaking, Quine's notion of ‘observationality’ is a matter of degree. But the complexity
arising from that can be avoided by, as Quine himself did, calling ‘observation sentences’ only
those sentences which are highly observational.

It seems to me, however, that there may be cases in which, intuitively 
speaking, the most natural translation manual gives a sentence of our home 
language which is not observational or an observation sentence that has an 
entirely different stimulus meaning for us from that of the original native
observation sentence for them, and in which, intra-linguistically and diachro- 
nically, the stimulus meaning of an observation sentence changes overtly as a 
result of theoretical considerations and/or experience. And a difficulty 
concerning changes of language similar to that considered in Section 2 will 
arise. Consider the following examples. 

Suppose that a native observation sentence S has the same stimulus 
meaning as ‘Food’ (or ‘That is food’) in English. Then, of course, the linguist 
will translate S as ‘Food’ or ‘That is food’ (or ‘Food-stage’, 'Food-hood', etc.?), at 
an early stage of radical translation. But, as he proceeded in compiling his 
translation manual, let us suppose, the most natural translation of S has 
proved to be ‘The gift of God’, and he replaced his initial tentative translation 
‘Food’ by ‘The gift of God’. In this case, it will be wrong to reject the 
replacement on the ground that the stimulus meaning of S for the natives is not
the same as that of ‘The gift of God’ for the linguist. 

The problem here is that ‘theoreticity’ can penetrate into observation 
sentences. If we view the present case intra-linguistically, we can say that the 
natives may change someday the way in which they speak of food in ‘the spirit 
of science and... the evolutionary spirit of ordinary language itself’. We can
give an actual example of this kind of change. In Japanese, ‘Hito-dama’ (that 
means ‘Human soul’ according to the normal translation manual from 
Japanese to English) was once an observation sentence whose (affirmative) 
stimulus meaning was the set of stimulations people had when they saw 
phosphorescence at a graveyard. But now, as a result of changes in their 
‘theories’, they do not all call phosphorescence 'Hito-dama'. Many people are 
still using this word (and therefore this one-word sentence) occasionally, but 
there are many other Japanese people who decline to do so. Thus, it has ceased 
to be an observation sentence. 

Quine admits that, if a native sentence is a stimulus-analytic one, the linguist 
can translate it as, e.g., ‘All rabbits are men reincarnate’ (Quine [1960], p. 69). 
And, I cannot see why the linguist should not translate an observation sentence 
that has a simple and salient structural relation to that sentence, and that has 
the same stimulus meaning as ‘Rabbit’ for him, as ‘Man reincarnate’. To take 
an actual and simpler intra-linguistic example, when we denied ‘A whale is a
fish’, the stimulus meaning of ‘Fish’ must have changed as a result of 
theoretical considerations and/or experience, and not because we simply 
changed the meaning. 

In ‘Epistemology Naturalized’, Quine touches upon ‘epistemological
nihilism... reflected somewhat in the tendency of Polanyi, Kuhn, and the late
Russell Hanson to belittle the role of evidence and to accentuate cultural 
relativism’ (Quine [1969], p. 87. Cf. Quine [1970a], p. 5), and criticizes 
Hanson’s contention that so-called observations vary from observer to 
observer with the amount of knowledge that the observers bring with them. 
Quine dismisses Hanson’s argument with a brief reply: ‘What counts as an 
observation sentence varies with the width of community considered. But we 
can always get an absolute standard by taking in all speakers of the language, 
or most’ (Quine [1969], p. 88). There is nothing wrong with Quine’s claim that 
one can neatly define observation sentences. My point is rather that he has 
passed by the real problem of the ‘theory-ladenness of observation’, and that 
this is his problem too.*

* In his paper ‘Empirical Content’, Quine says that ‘we recognize the observation sentence to be
theory-laden’ (Quine [1981], p. 25). But he means by this nothing more than that ‘terms
embedded in observation sentences recur in the theory formulations’.

I think that what is wrong with Quine’s contention is his presumption that
the degree of observationality (the degree of agreement among the members of
a linguistic community on the verdict in response to a short stimulation) and
the degree of theoreticity (the degree of ‘theory- (or belief-)ladenness’) can be
measured on the same scale. We should think that the scales for them are two
separate scales. ‘The fixity needed in communication’ requires the community
to share a body of beliefs. And as long as these beliefs are shared, it is entirely
possible that the stimulus meanings of observation sentences for almost all
speakers of the language coincide, even if the observation sentences in
question are belief-laden. (Cf. ‘Hito-dama’.) The stability and generality of
beliefs can give aid to the stability and generality of observation sentences.
Indeed, ‘Policeman’, and today ‘Prime Minister’ (thanks to the popularization
of television) satisfy Quine’s criterion of observation sentences, and are, at the
same time, heavily loaded with generally shared beliefs. And the same thing
can be said, in my previous examples, about ‘The gift of God’, ‘Hito-dama’,
‘Man reincarnate’, etc.

Among observation sentences, i.e., those sentences on which almost all
speakers of the language give the same verdict when given the same
concurrent stimulation, there may be, on the one hand, those which have,
besides the connections with sensory stimuli, connections with the sentences
expressing generally shared beliefs perhaps by sharing some ‘theoretical’ terms
(belief-laden observational sentences), and, on the other hand, those which
have no such connections (purely observational sentences). It should be clear
now that a purely observational sentence is analogous to the criterial sentence
of a one-criterion word in one respect. They are both characterized by the
uniqueness of commonly shared speech disposition (of a certain kind). Just as
the criterial sentence of a one-criterion word is immune to revision, so is the
stimulus meaning of a purely observational sentence. And, just as a one-criterion
word can become a law-cluster word in the long term, a purely
observational sentence can become a belief-laden one when people come
socially to share beliefs expressed by (standing) sentences containing some
terms which appear in that observation sentence. The characterizations of 
purely and belief-laden observational sentences are static (temporary) ones. 
Moreover, observationality itself is a static characteristic of a sentence (Cf. 
‘Hito-dama’ and ‘Prime Minister’). Just as a commonly shared belief involving 
law-cluster words can be revised subject to the Principle of Compensation, it 
should be thought that the connection of a belief-laden observational sentence 
with sensory stimuli (its stimulus meaning) can be revised subject to an 
appropriate extension of that principle. The extension of the Principle of 
Compensation, which suggests itself, to cover the cases of disagreement over
observation sentences is this: 


For someone's talk to be within the quasi-same language as that of other 
members of the linguistic community, it is necessary that either (1) his 
talk is sufficiently in conformity with the speech dispositions shared by 
almost all members of the community, or (2) though he disagrees over a 
non-negligible part of common beliefs and/or observation sentences, he can 
give enough arguments, which in turn are in accordance with the rest of 
common speech dispositions, to compensate for the lack of fluency of 
dialogue due to the disagreement. 


So long as a revision of the stimulus meaning of an observation sentence is 
under the constraint of this (extended) Principle of Compensation, we need not 
worry about the suspicion of 'epistemological nihilism', because, when 
someone disagrees over an observation sentence, the remaining part of 
common dispositions, with which the compensating arguments must be in 
conformity, includes the common stimulus meanings of other observation 
sentences. 


5 LEARNABILITY OF LANGUAGE 


In the picture of language I have tried to depict so far, the status of observation 
sentences is, unlike in Quine’s own picture, not privileged. As I noted above, 
the two reasons why Quine gives a privileged status to observation sentences 
are their roles as 'the repository of evidence' and as 'the only entry to a 
language'. But we have seen in the last section that the denial of the privileged 
status does not give rise to ‘epistemological nihilism’, and does not impair the 
first role of observation sentences Quine pointed out. 
In this section, I shall consider the second role. Quine says: 


Observation sentences are the ones we are in a position to learn to understand 
first, both as children and as field linguists . . . 

The observation sentence is the cornerstone of semantics. For it is, as we just 
saw, fundamental to the learning of meaning. (Quine [1969], p. 89) 


But from which part we start to learn our language is one question, and 
whether that part is unrevisable is another. What we learn as children or as 
linguists is the present stage of the language (perhaps including the present 
stimulus meaning of ‘Prime Minister’). The question of whether the stimulus 
meanings of observation sentences are permanently unrevisable (within the 
same language) has nothing to do with the possibility of our learning the 
present stimulus meanings of (present) observation sentences, provided that 
there is certain stability. It is true that we could never learn a language if there 
were no stability in it. But this is not only true about observation sentences but 
also true about the interior of language. What we aim at when we learn our
language is to acquire all the speech dispositions which are at present shared by 
almost all members of the linguistic community. 

What, then, is it to understand a single sentence or a single expression? That 
is to acquire the set of common dispositions which are enough to talk with 
others using the sentence or expression in question. This may require many 
things to be learnt. Even to understand a one-criterion word such as 
‘bachelor’, it is not enough to acquire the disposition to assent unconditionally 
to its criterial sentence, ‘All bachelors are unmarried men’. For, through the 
words ‘unmarried’ and ‘man’, ‘bachelor’ has connections with a rather wide 
area of the language. A complete understanding of even ‘Mama’ or ‘Mother’ 
involves some sociological and biological beliefs. It may be the case that the 
area of the language which is to be learnt in order fully to understand a single 
sentence or expression is the whole language. Therefore, my view allows for 
the possibility of (though does not necessarily entail) a ‘holistic view’ of 
language. It is because his view is holistic that Quine stresses the importance of 
observation sentences as ‘the only entry to a language’ (Quine [1969], p. 89). 
Because, as Dummett says in his [1976] (p. 79), ‘on a holistic view, it is
impossible fully to understand any sentence without knowing the entire
language’, the question arises: How can we learn a language bit by bit? 


We may well have begun then to wonder whether meanings even of whole 
sentences (let alone shorter expressions) could reasonably be talked of at all, 
except relative to the other sentences of an inclusive theory. Such relativity 
would be awkward, since, conversely, the individual component sentences offer 
the only way into the theory. 

(Quine [1960], p. 34) 


Then it is very natural to invoke observation sentences, whose stimulus 
meanings are deemed as constituting 'a reasonable notion of meaning', to 
resolve this awkwardness. An observation sentence is thought learnable on its 
own, without learning the entire language. 

But, even if we could fully understand observation sentences in a theory- 
neutral or pre-theoretic way (which I deny), the question would remain 
untouched: How can we learn the other parts of language bit by bit? Since, on a 
holistic view, interior sentences cannot be ‘defined by’ or ‘reduced to’
observation sentences, even if we could understand observation sentences one
by one, it would not be particularly helpful to understanding other sentences 
bit by bit. I think, therefore, that the problem of learnability (if there is ever 
such a problem) is not peculiarly related to observation sentences, but, as 
Dummett sees it, a problem of a holistic view itself. 

One of the reasons why Dummett opposes a holistic view of language is that 
he thinks that, on that view, ‘there can be nothing between not knowing the 
language at all and knowing it completely’ (Dummett [1976], p. 79). I think, 
however, that this consequence does not follow from the fact that ‘on a holistic 
view, it is impossible fully to understand any sentence without knowing the 
entire language', unless we presuppose that there is nothing between not 
understanding a single sentence or expression at all and understanding it 
completely. Why should we not say that children learn their language bit by 
bit not only in the sense of part by part but also in the sense that their
understanding of a single sentence or expression also proceeds gradually? We
can, I think, speak of a partial understanding of a single sentence or expression. 
On my account, it corresponds to a partial acquisition of the common speech 
dispositions needed for fluent talk involving the sentence or expression. Even 
the acquisition of one disposition is gradual. All these continuities are reflected 
by the continuity of the degrees of fluency of dialogue. It seems to me that the 
assumption that understanding of a single sentence or expression is a matter of 
all or none might tend to be connected with the view that understanding of it is
grasping some entity which is the meaning of it in one's mind (or head), and that 
this assumption might be a dangerous step towards the reification and 
mystification of ‘meaning’. 

If we admit of a partial understanding of a single sentence or expression, 
there can be various intermediate states between not knowing the language at 
all and knowing it completely, on a holistic view, even though, on that view, a 
complete understanding of a single sentence or expression requires the mastery 
of the entire language. Thus, there is no serious difficulty concerning the 
learnability of language in the revisability of the stimulus meanings of 
observation sentences (restricted by the Principle of Compensation) or in a
holistic view of language in general. And we have already seen that we need 
not worry about the suspicion of ‘epistemological nihilism’, provided we are 
subject to the Principle of Compensation (see Section 4). Hence, there is no
obstacle to abandoning the privileged status of observation sentences. 

A revision of the stimulus meaning of an observation sentence subject to the 
Principle of Compensation is possible only when it is a belief-laden observa- 
tional sentence. And the very existence of belief-laden observational sentences 
is possible on the strength of the stability of the interior properly captured in the 
Word and Object model. The Word and Object model seems to me to be an 
intermediate stage in the sense that it preserves an unnecessary relic of the 
‘Two Dogmas’ model, namely, the need for the special status of observation
sentences. Quine can kick away the ladder of the unrevisability of the stimulus
meanings of observation sentences after he has climbed up it. 


NOBUHARU TANJI 
Tokyo Metropolitan University 


REFERENCES 


Brown, H. I. [1977]: Perception, Theory and Commitment. The New Philosophy of Science. 
Chicago: The University of Chicago Press. 

Dummett, M. A. E. [1976]: ‘What is a Theory of Meaning (II)’, in Evans and McDowell.

Dummett, M. A. E. [1978]: Truth and Other Enigmas. Cambridge, Mass.: Harvard 
University Press. 

Evans, G. and McDowell, J. (eds.) [1976]: Truth and Meaning. Oxford: Clarendon Press.

Foster, L. and Swanson, J. W. (eds.) [1970]: Experience and Theory. London: Duckworth.

Putnam, H. [1975]: Mind, Language and Reality. Philosophical Papers, vol. 2. Cambridge: 
Cambridge University Press. 

Quine, W. V. O. [1951]: ‘Two Dogmas of Empiricism’, in Quine [1961].

Quine, W. V. O. [1960]: Word and Object. Cambridge, Mass.: M.I.T. Press. 

Quine, W. V. O. [1961]: From a Logical Point of View (2nd ed.). Cambridge, Mass.: 
Harvard University Press. 

Quine, W. V. O. [1969]: Ontological Relativity and Other Essays. New York: Columbia 
University Press. 

Quine, W. V. O. [1970a]: ‘Grades of Theoreticity’, in Foster and Swanson.

Quine, W. V. O. [1970b]: ‘On the Reasons for Indeterminacy of Translation’, The Journal 
of Philosophy, 67, pp. 178-83. 

Quine, W. V. O. [1974]: The Roots of Reference. La Salle, Ill.: Open Court.

Quine, W. V. O. [1981]: Theories and Things. Cambridge, Mass.: Harvard University
Press. 

Wittgenstein, L. [1969]: On Certainty. Oxford: Basil Blackwell.


Brit. J. Phil. Sci. 40 (1989), 249-253 Printed in Great Britain


DISCUSSION 


How to Defend Science Against Scepticism: 
A Reply to Barry Gower 


1 In his discussion paper ‘Chalmers on Method’, Barry Gower [1988] takes
issue with my rejection of the notion of a universal, ahistorical account of
scientific method appropriate for guiding the work of scientists or for 
appraising the theories that they produce. He labels my position ‘sceptical 
relativism’ and claims that I have not properly justified it. He suggests that my 
use of historical examples to reject universal method involves a naive 
falsificationist methodology that I reject for science itself. He further suggests 
that the notion of success involved in my claim that scientific standards are 
implicit in successful practices either commits me to an extreme form of 
relativism or to some universal standard of success that I otherwise claim to 
reject. According to Gower, changes in standards of the kind illustrated by my 
historical examples need not be taken as incompatible with universal method 
but as the rejection of a wrong method and its replacement by one that 
approximates more closely the correct universal method. Gower points to a 
connection between methodological rules and aims, rejects an extreme 
relativist account of aims and observes that Aristotelian aims can be rejected 
for being too limited and distorting, the implication being that these
observations conflict with my position. 

In this reply I attempt to clarify my position and indicate how it can be 
defended in the face of Gower’s criticism. Some preliminary remarks will help 
set the scene, and will indicate a sense in which the anti-sceptical dispositions 
of Gower and myself coincide. It is clear that Gower is keen to defend science 
from sceptical attack. It is precisely because I share this view that I attempt to 
offer a defence of science that does not involve an appeal to absolute standards
inherent in some universal method. If the defence of science is portrayed as 
necessitating some such appeal then, when that defence fails, as I think it 
must, the way is open for the extreme sceptical alternative. Paul Feyerabend's 
task is made too easy. It is somewhat ironic to note that I received my copy of 
Barry Gower's article on the very day that I had completed an article [1988] 
defending the privileged epistemological status of science against the sceptical 
attack implicit in the writings of sociologists of science. I certainly do not intend 
my rejection of universal method to constitute a brand of sceptical relativism,
as Gower alleges. 





250 Alan Chalmers 


2 The rejection of universal method need not result in scepticism if due 
cognizance is taken of the aim of science. That aim is the production of 
knowledge of the world and here, as in the writings criticized by Gower, I avoid 
the special problems associated with attempts to gain knowledge of the social 
world by restricting the discussion to the natural sciences. I suggest that some 
uncontroversial assumptions can help us fill out this notion of the aim of 
science to a small degree. Once we reject the possibility of acquiring knowledge 
of the world a priori we are led to the recognition that the adequacy of any 
knowledge claim must be gauged by pitching it against the world in some way. 
If we further accept that knowledge is never produced from scratch, but 
emerges as the result of modifications or additions to previous knowledge, then
we can appreciate that the adequacy of a knowledge claim must be judged by 
the extent to which it is an improvement on what came before, with the 
understanding that the ‘improvement’ must be established by pitching the 
novel claims against the world. This is all extremely vague, of course, but it is 
sufficient to combat scepticism. Galileo's introduction of the telescope into
astronomy can be interpreted as a practical demonstration of how the aim of
science could be served by abandoning previously accepted standards
concerning what should count as evidence in science and replacing them by
new ones. My [1985a] construal of that episode was designed not only to pose 
problems for those who wish to assume some substantive universal standards 
for science but also to counter Feyerabend's sceptical version. 

As Gower points out, aims and methodological rules are not unconnected. It 
could be argued that the aim of science as I have characterized it implies the 
methodological rule ‘attempt to gauge the merits of scientific theories by 
pitching them against the world’. If the acceptance of such a dictate constitutes 
allegiance to a universal method then I concede the case to Gower. However, 
as a glance at the volumes and discourses devoted to attempts to articulate 
some version of scientific method makes abundantly clear, advocates of 
universal method have something much more substantive in mind than the
vague precept that I have suggested above. I claim that there is no substantive 
universal account of method adequate for science. 

An analogy that I learnt from Wal Suchting ([1983], pp. 29-30) will help
clarify the situation. Quality can be said to constitute part of the aim of 
manufacturers. It is not entirely vacuous to invoke manufacturers to aim for 
quality. However, when it comes to actual cases of quality control some 
substantive notion of quality control and procedures for testing for quality are 
necessary. What is more, specifications of quality and ways of measuring it will 
change as the manufacturing process is modified or transformed. The absence
of a universal characterization of quality does not lead to extreme relativism in 
the manufacturing industry even though the aim of producing manufactured
products of quality is much vaguer than the aim of producing knowledge as I
have characterized it. When I claim [1986, p. 26] that there is no universal
method or set of standards but that there are ‘historically contingent standards
implicit in successful practices’ the success referred to is success at coming to 
grips with the world. 


3 A challenge that I put to the advocates of universal method and standards, 
implicit in my [1982, pp. 166-70] and made more explicit in my [1985b], is 
for them to indicate the way in which some substantive account of method is to 
be arrived at. It is implausible to imagine that philosophers can construct such
an account a priori, without reference to the actual practice of science. On the 
other hand, if an account of method is to be derived from empirical 
investigation of the practice of science other difficulties emerge. Firstly, if the
account of method is straightforwardly descriptive of what occurs in science
then it will lack the normative force required of it by the advocates of method. 
Secondly, the favoured accounts of method do not withstand comparison with 
the practice of successful sclence. Thirdly, the historical evidence strongly 
suggests that the methods and standards of science change and evolve. It may 
be that my challenge can be met, but Gower gives no indication of how he 
thinks it can be met. He neither gives a hint of how a universal account of 
method is to be established, nor does he give an indication of the account of 
method that he advocates. 

The problem that Gower is up against in this respect is in evidence in his 
discussion paper. In a number of places Gower seems to accept the fact that 
methods and standards change. He refers, for example, to the ‘evident 
differences between Aristotle and Galileo about scientific method’. Gower 
renders substantive changes in method compatible with universal method by 
adopting what he terms ‘methodological realism’, according to which there is 
a correct account of method which we do not possess but which we progress 
towards. Some aspects of Aristotle's method were simply wrong, Galileo's 
method was better and ours is better still in the sense of being closer to the true 
method. In my opinion the problems with methodological realism are 
manifold. Here I will mention only the problems it poses for Gower's case 
against scepticism. If, as Gower suggests, an adequate universal method is 
something that we do not possess but is something that we aspire to, then it 
cannot be used now to combat scepticism or relativism. The respect in which 
one method can be said to be an improvement on another is problematic for 
Gower. Since it is his view that a non-relativist account of the superiority of one 
theory over another requires appeal to some universal standard, so, it would 
seem, he must accept that a non-relativist account of the superiority of one 
method over another will require some higher order universal standard. 
Needless to say, there is no suggestion of what that standard is or how we 
might have access to it. 

My own defence of science against sceptical attack does not suffer from the 
difficulties that Gower and other defenders of universal method face. I can 
respond to my own challenge and defend my characterization of science. 
Firstly, my characterization of science and its aims is based on some very 
general assumptions such as the impossibility of a priori knowledge of the 
natural world and the insistence that thinking does not make it so. My position 
would need to be altered if such assumptions were shown to be faulty. 
Secondly, there would be no point to a particular characterization of science 
and specification of its aim if there was nothing in our epistemological 
repertoire that corresponded to that characterization or if progress towards the 
aim were unrealizable. I claim that much of what is typically regarded as 
scientific knowledge conforms to my characterization, and that the aim to 
produce knowledge and to gauge its adequacy by systematically pitching it 
against the world is indeed accomplishable and frequently accomplished 
within science. Thirdly, the sense in which my characterization of science and 
its aim is weakly descriptive or empirical does not deprive it of all normative 
force. Claims to science can be criticized as such for failure to contribute to its 
aim. Fourthly, changes in substantive scientific method and standards pose no 
threat to my position, since no substantive characterization of science is 
proposed. What is more, changes in standards and method can be appraised 
from the point of view of the extent to which those changes facilitate 
realization of the aim of science. 

The position I adopt is relativist in the sense that appraisals of theories or 
methods are relative to the aim of science. However, here I side with Gower in 
resisting an extreme relativist construal of aims as arising from the whims or 
dispositions of individuals. If Gower's remarks that ‘aims themselves require 
justification’ and that ‘Aristotelian aims for scientific inquiry have turned out 
to be too limiting and distorting’ are intended as a criticism of my position they 
are certainly misplaced. I would go further to say that the importance of the 
aim of producing scientific knowledge in our society is intimately bound up 
with the nature of that society itself and simply not a matter of individual 
choice. This is not to say that the aim of science must necessarily take 
precedence over other aims. It could well be argued that a more socially 
equitable use of the knowledge that we have is more urgent than the 
production of more knowledge in many contexts in contemporary society. The 
stand taken on such issues will inevitably be relative to the interests of 
individuals or classes. 


4 My study of Galileo's introduction of the telescope into astronomy [1985a] 
supports the view that these innovations involved a change in scientific 
method and standards. I construe this as posing a problem for those who wish 
to defend the notion of a universal scientific method. According to Gower, this 
aspect of my argumentation involves me in using a naive falsificationist 
methodology that I reject within science itself. ‘There is, surely, something 
incongruous in the belief that the best methodologists, by paying attention to 
the historical facts about science, have come to the conclusion that the best 
scientists ignore facts.’ I do not think Gower's objection can be sustained. 

Gower formulates his argument in very general terms. He points out that if 
there is a prima facie clash between a methodology and some observations 
concerning scientific practice, then the rejection of the methodology is not the 
only possible response, as a naive falsificationist might suppose. One might 
take issue with the adequacy of the characterization of the practice, one might 
interpret the methodology in a way that renders it consistent with the alleged 
counter-evidence, or one might deny that the practice in question is legitimate 
science. I totally accept Gower's point at this general level. But merely making 
it in this general way does not constitute a rebuttal of my case any more than a 
suitably revised version of Gower's argument can be taken as denying that 
facts can ever tell against a theory. If Gower's objection to my claim that 
Galileo's introduction of the telescope into astronomy poses problems to 
defenders of universal method is to have any force he must give some 
substance to it. He could challenge my account of the episode, he could offer an 
account of universal method consistent with it or he could, although I am sure 
he would not, deny that Galileo's innovation constituted a scientific advance. 
But Gower does none of these things. Apart from noting, with some 
justification, that I misuse a quotation from Aristotle, he offers no criticism of 
my case study whatsoever, nor does he offer an account of method consistent 
with it. Were he to do so, and were I to fail to respond to his criticism, then he 
would have grounds for labelling my position naive falsificationist. As his case 
stands, I do not see that he has grounds for that claim. 


ALAN CHALMERS 
University of Sydney 


REFERENCES 


CHALMERS, A. [1982]: What is this thing called science?, 2nd edition. Open University 
Press. 

CHALMERS, A. [1985a]: ‘Galileo’s Telescopic Observations of Venus and Mars’, British 
Journal for the Philosophy of Science, 36, pp. 175-83. 

CHALMERS, A. [1985b]: ‘The Case Against a Universal A-historical Account of Scientific 
Method’, Bulletin of Science, Technology and Society, 5, pp. 555-67. 

CHALMERS, A. [1986]: ‘The Galileo That Feyerabend Missed: An Improved Case Against 
Method’, in J. A. Schuster and R. R. Yeo (eds), The Politics and Rhetoric of Scientific 
Method, D. Reidel, pp. 1-31. 

CHALMERS, A. [1988]: ‘The Sociology of Knowledge and the Epistemological Status of 
Science’, forthcoming in Thesis XI. 

GOWER, B. [1988]: ‘Chalmers on Method’, forthcoming in British Journal for the 
Philosophy of Science. 

SUCHTING, W. [1983]: ‘Knowledge and Practice: Towards a Marxist Critique of 
Traditional Epistemology’, Science and Society, 47, pp. 2-36. 


Brit. J. Phil. Sci. 40 (1989), 255-259 Printed in Great Britain 


DISCUSSION 


Testing for Convergent Realism 


In ‘A Confutation of Convergent Realism’, Larry Laudan presents the realist 
with these fascinating challenges: 


What, then, of realism itself as a ‘scientific’ hypothesis? . . . If realism has made 
some novel predictions or been subjected to carefully controlled tests, one does 
not learn about it from the literature of contemporary realism. ([1981], p. 46). 


He then goes on to say: 


No proponent of realism has sought to show that realism satisfies those stringent 
empirical demands which the realist himself minimally insists on when 
appraising scientific theories. ([1981], p. 46). 


Not only do I fully endorse what Laudan says in these passages, I think it is time 
to take him up on these challenges. 

Laudan’s position has been that scientific progress can be explained without 
having to appeal to verisimilitude. Instead, we explain the success of a 
scientific theory in terms of its subjugation to the canons of experimental 
controls: those theories that have been put through proper testing during the 
course of their formulation are the ones that are most likely to be reliable. 
([1984], pp. 96-101) 

He goes on to claim that, unlike the realist hypothesis, his explanation is 
open to testing, for it can be used to make predictions. For example, 


... where there are individuals or whole societies which shape their beliefs without the 
controls associated with science, those beliefs will be less reliable on the whole than the 
beliefs of a “scientific” culture. ([1984], p. 101) 


So, while realism cannot satisfy 'stringent empirical demands', Laudan's rival 
hypothesis has the virtue of being open to controlled experimentation. 

According to Laudan, then, we must be able to test the connection between 
reliability and shaping beliefs in accordance with scientific controls. By this 
means we can independently show that there is a relationship between them. 
Otherwise, Laudan would beg the question in exactly the same way he and 
Fine argue that the realist's use of inference to the best explanation begs the 
question. His strategy, an excellent one at that, is to look and see if theories 
that satisfy the rules of scientific testing are, in fact, more reliable than those 
that do not. If so, we can explain progress in terms of scientifically regulated 
research, without having to appeal to verisimilitude. But why can’t such a testing 
strategy be applied to the realist's hypothesis as well? 




The convergent realist maintains that as our theories lead to better 
predictions and greater accuracy, they depict nature more accurately. The 
anti-realist holds, on the other hand, that it is a real possibility for a series of 
highly successful hypotheses to end up wandering further from the truth. How 
can we set up a crucial experiment to decide between these rival hypotheses? 
The basis for such an experiment rests on something the anti-realist’s position 
completely overlooks: we construct and confirm innumerable theories all the 
time about observable things, theories which can be directly observed to be true or 
false. It's just that we often construct theories about observables prior to being 
in position to directly observe their truth value. Most importantly, however, 
we often make predictions (and measurements) on the basis of these theories, 
again, before we can go to the trouble to directly observe whether or not they 
are true. 

Let’s consider this case in point. Something goes wrong with your car; 
suppose you have trouble starting it. You call your mechanic and describe the 
symptoms to him. He hypothesizes that something is wrong with the starter 
and asks you to try to do other things to start the car. On the basis of his 
hypothesis, he predicts what will happen and, sure enough, he’s right. He then 
goes even further by theorizing what's wrong with the starter and asks you to 
try even more things and he’s right, again, about what subsequently happens. 
Let’s say that he claims that a particular part in the starter is worn. Remember, 
your mechanic is doing this over the telephone. Even so, you have the 
impression that he’s zeroing in on the truth about your car. You take the car 
in, the starter is removed, that particular part is taken out and examined and, 
sure enough, it’s worn beyond repair. The part is replaced, the starter is put 
back in the engine and the car starts like new.¹ What this example shows, 
besides my ignorance of automobile mechanics, is that a hypothesis can later 
be observed to be true—and, hence, cease to be an hypothesis—after it has been 
used to make many successful predictions.² 


1 It turns out that there are automobile mechanics on call-in radio programs where things like 
this actually happen. For example, one such program can be heard on National Public Radio's 
‘Weekend Edition’ in the United States. What happens is that the caller gives the mechanics the 
year and make of his car and the symptoms. They, in turn, ask the caller what happened when 
certain things were done to his car. On the basis of his answers, the radio mechanics ‘bet’ what's 
wrong with the car, give an estimate of the repair costs, all of this being open to future 
confirmation by the caller's mechanic. 

2 It is interesting to compare this example with Laudan's car mechanic case (1984, pp. 97-8). His 
mechanic irrationally explains the failure of his car to start on cold mornings in terms of the 
brakes (whether or not they actually need repair) because whenever they are replaced (in a 
warm garage) the car readily starts. In the first place, there is just one hypothesis here, one 
which does not successfully predict anything new about the car's behavior. Contrast this with a 
series of hypotheses, with each successive hypothesis yielding more predictions. For example, 
the mechanic could guess that something is wrong with the carburetor, pinpointing the 
problem to a sticky butterfly valve. The car, then, should stall under a variety of circumstances. 
That the valve is sticking can be directly observed and, scepticism aside, we can tell if that was 
the cause of the starting problem (and others). 




Surely, Laudan and van Fraassen can’t deny that there are many theories 
about observables the truth of which can eventually be settled directly by 
observation. So, why can’t we use these theories as a basis for a crucial 
experiment to decide between the realist and anti-realist hypotheses? How can 
we test them? By taking theories which can eventually be observed to be true 
or false and testing to see if any false theories can systematically lead to better 
predictions and measurements in spite of their being false. It’s my contention that 
the anti-realist's position will be shown to be empirically false by such testing. 

Is it a real possibility that a progressive series of hypotheses about a given 
subject matter could wander from the truth, even if predictions, measurements 
and inventions get better with each successive hypothesis? Let's put it to the 
test, then. Instead of considering the contentious cases in the history of science 
that, for example, Laudan cites, cases such as phlogiston and the ether, we 
should set up some type of controlled experimentation. After all, his scepticism 
about realism is based on the claim that, for each case of scientific progress, it is 
possible that we are getting further from the truth. But how can he empirically 
support such a possibility? It seems that he feels that such a possibility is well 
established by listing cases that have actually occurred in the history of 
science. However, these cases are contentious at best. My point is why wait for 
these cases to occur in the history of science when we should be able to produce 
them right now if, as the anti-realist claims, they are a real possibility? All I am 
proposing, then, is a fair test of his hypothesis, a crucial experiment on the 
connection between progress and verisimilitude. 

So, let's see if this issue can be empirically settled. Prepare a series of black 
boxes. The preparers will try to design the inside of each box in such a way that 
its observable behavior will systematically mislead a researcher into formulating 
false hypotheses about what's going on inside the box. But the preparer 
isn't just trying to get the researcher to come up with a false hypothesis in the 
beginning. That would be too easy. The preparer wants the examiner to come 
up with a series of (more refined) hypotheses about the box, hypotheses that 
end up going astray, even though he has been making better and better predictions 
about the behavior of the box all the while. We can then play this little game, one 
very much like Turing's imitation game; only, this time, we trick the scientist 
into making false hypotheses about the box while believing that he is getting 
closer to the truth since his hypotheses are yielding better predictions. If we are 
clever enough to succeed, then we have evidence for the anti-realist 
hypothesis, for all the researcher actually achieved was to come up with a 
series of hypotheses which only saved appearances but failed to get at the truth 
about the inside of the box. After the boxes are constructed, we seal them and 
have scientists examine each one. While they cannot open the box and observe 
what's inside, they can use any means at their disposal to test the behavior of 
each box. 

Then we gather the fruits of their labor. We are only interested in those 
hypotheses that yield better and better predictions about the behavior of the box 
being examined. The judges compare each such hypothesis with reality by 
opening the corresponding box. Now all we have to do is determine the ratio of 
cases where a successful series of hypotheses have actually strayed further 
from the truth over the total number of successful series of hypotheses.³ 
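
The tally just described can be sketched in a few lines of Python. This is only an illustration under assumptions that go beyond the text: that each series comes with a numerical prediction error for each successive hypothesis, that the judges can assign each hypothesis a numerical distance from the truth once the box is opened, that a series counts as 'successful' when its prediction error strictly falls at every step, and that it has 'strayed' when its last hypothesis is further from the truth than its first. The names HypothesisSeries, is_successful, has_strayed and straying_ratio are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HypothesisSeries:
    """One researcher's successive hypotheses about one sealed box.

    Both lists are illustrative assumptions: the text does not say how
    predictive success or closeness to the truth would be scored.
    """
    prediction_errors: List[float]  # one entry per hypothesis, lower is better
    truth_distances: List[float]    # judges' verdicts after opening the box


def is_successful(series: HypothesisSeries) -> bool:
    """Assumed criterion: every hypothesis predicts better than its predecessor."""
    errors = series.prediction_errors
    return len(errors) >= 2 and all(b < a for a, b in zip(errors, errors[1:]))


def has_strayed(series: HypothesisSeries) -> bool:
    """Assumed criterion: the last hypothesis is further from the truth than the first."""
    return series.truth_distances[-1] > series.truth_distances[0]


def straying_ratio(all_series: List[HypothesisSeries]) -> Optional[float]:
    """Ratio of successful series that strayed to all successful series.

    Returns None when no series is successful, since the ratio is then undefined.
    """
    successful = [s for s in all_series if is_successful(s)]
    if not successful:
        return None
    return sum(has_strayed(s) for s in successful) / len(successful)


# Made-up data for illustration only, not experimental results.
example = [
    HypothesisSeries([0.9, 0.5, 0.2], [3.0, 2.0, 1.0]),  # successful, converging
    HypothesisSeries([0.8, 0.4, 0.1], [2.0, 3.0, 4.0]),  # successful, straying
    HypothesisSeries([0.5, 0.6, 0.7], [1.0, 1.0, 1.0]),  # not successful, ignored
    HypothesisSeries([0.7, 0.3], [1.0, 0.5]),            # successful, converging
]
ratio = straying_ratio(example)
```

On the anti-realist's view a ratio well above zero should be achievable by sufficiently clever preparers; the bet made here is that it will not be. Any real test would of course have to fix the two scoring conventions in advance.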

I'll bet that the ratio will approach zero. My reason for this is that we are all 
familiar with cases of everyday hypotheses about observables which are highly 
successful and are subsequently checked out by direct observation. I don’t 
think any of us can come up with a case where a highly successful series of 
hypotheses, with each successive hypothesis yielding better predictions, 
actually ends up getting further from the truth in the long run. So, why 
shouldn’t we believe that a highly successful series of hypotheses approximate 
the truth, when the only difference is that we can't observe its verisimilitude? 

In conclusion, I have argued that we can test for the relationship between 
verisimilitude and success or reliability, and that such a test is open to 
experimental controls. We can use these controls to establish how fair the tests 
were, how good the samples were, etc., in exactly the same way Laudan uses 
them to test his rival explanation of scientific progress. 

Once it is empirically established that there is a connection between progress 
and verisimilitude, we can finally explain the former in terms of the latter 
without being guilty of begging the question. How does this affect Laudan's 
explanation of progress? While I firmly believe that he is right about there 
being a connection between reliability and passing tests that are conducted in 
accordance with experimental controls, I contend that we do not explain the 
former in terms of the latter. Rather, the two are explained by appealing to a 
common ‘cause’: the reason why a series of hypotheses passes tests that are 
conducted under the appropriate experimental controls and yields better 
predictions and measurements is that each successive member of the series is 
getting closer to the truth.⁴ 

JERROLD L. ARONSON 

Department of Philosophy 

State University of New York at Binghamton 

3 This ‘game of science’ is intended to be more than just a thought experiment. All too often, the 
apprenticeship of a scientist involves repeating experiments of the past, experiments of which 
the results are already known to the student. Wouldn't it be better to have the science student 
‘tinker’ with nature, instead, by performing black box experiments, where the contents of the 
box are unknown to him? What I am proposing, then, is a program of working with hypothesis- 
formulating-simulators as an integral part of the training of scientists. 

Suppose students undergo such training. Their teacher could discuss their strategies for 
discovering what's inside the box and if the student ends up off the mark, they can work out 
what went wrong in the course of hypothesis formulation. I contend that this would be a better 
way to train scientists than the traditional approach, especially since the latter tempts the 
student to 'fudge' the data in order to get the desired, known results. 

4 Notice how, in the above cases, the judges can actually observe if the scientist is getting closer to 
the truth because, of course, the judges know what's inside the boxes. This way, we avoid the 
problems associated with developing a semantics of verisimilitude. 




REFERENCES 


LAUDAN, LARRY [1981]. ‘A Confutation of Convergent Realism’, Philosophy of Science, 
Vol. 48, No. 1. 

LAUDAN, LARRY [1984]. ‘Explaining the Success of Science’, in J. T. Cushing, C. F. 
Delaney and G. M. Gutting (eds.), Science and Reality, Notre Dame: University of 
Notre Dame Press. 


Brit. J. Phil. Sci. 40 (1989), 261-273 Printed in Great Britain 


DISCUSSION 


Ruben and the Metaphysics 
of the Social World 


1 Are there Irreducible Social Entities? 
2 Social Properties and their Basis? 
3 Individualism and Explanation 


1 ARE THERE IRREDUCIBLE SOCIAL ENTITIES? 


1.1 In his recent book The Metaphysics of the Social World Ruben tries to 
defend a holistic position about the nature of social entities. I shall below 
present a critical review of the most basic issues and arguments of the book. 
The book is clearly written and gives detailed argumentation of the crucial 
issues. In my view it is the best monograph so far written on the topic in 
question. 

But perhaps I should say right at the beginning that it seems that I do not 
share his most basic intuitions and that I do not think he has been quite 
successful in his arguments. But the reader may and should of course judge 
Ruben’s success for himself after reading my criticisms. 

In the first chapter of the book Ruben argues against the Individualist's 
thesis that there are no irreducible social entities. The basic argument is stated 
for the case of social substances in terms of an example as follows: 


(1) The belief that France is a charter member of the United Nations is 
literally true. 

(2) The belief in (1) seems to require the existence of France as that which 
the belief is about. 

(3) The belief in (1) is not paraphrasable by, or logically equivalent to, any 
belief that does not require the existence of France for its truth. 

(4) There are no acceptable candidates with which France may be 
reductively identified. 

(5) Therefore: There is at least one irreducible social substance, France. 


However, I do not regard Ruben’s above argument as a sound one. For first, 
Ruben is only concerned with reductive ‘paraphrasing’ and ‘identification’. 
But I think it is unwarranted to leave out of consideration the possibility that 


262 Raimo Tuomela 


the ultimately right approach is an eliminative one. Accordingly, I shall not 
below pay much attention to the truth of premiss (4), which Ruben almost 
exclusively concentrates on.¹ 

In my view the above argument fails primarily because of the falsity or at 
least shakiness of its premisses (1) and (2). Let us start by considering (1). First, 
there is a small pedantic point to make. I presume Ruben does not mean to 
speak of anyone's state of believing having the content that France is so-and- 
so. Rather he means to speak of a content of a sentence which is a potential 
object of belief and of which he claims that it is literally true. So it seems that (1) 
is paraphrasable as 


(1’) The sentence ‘France is a charter member of the United Nations’ is literally 
true. 


Now we face the problems of what 'true' and 'literally true' mean. If 'true' 
refers to some kind of correspondence truth then there must of course be 
something in the world making (1’) true, and truth in the literal sense, as 
opposed to metaphorical truth, should mean that the truthmakers are some 
kind of elements of the actual world, I suppose. Alternatively, 'true' can refer to 
some kind of intralinguistic or pragmatic notion, such as semantic assertabi- 
lity, warranted assertability, or something of the kind. Then there is no 
assumption of the existence of truthmakers in any correspondence sense. If 
this second option is adopted, then of course Ruben's argument does not even 
get off the ground, for his premiss (2) then is clearly false. (And then we do not 
need to discuss premisses (3) and (4) at all.) 

1.2 Let us now consider Ruben's above argument by interpreting truth as 
(material) correspondence truth in a Tarskian sense entailing existence in 
the sense, for instance, (2) states. Now I can grant that under a naive 
commonsensical or 'folk sociological' interpretation (1) and (2) both seem 
true. (Note, by the way, that in (2) we need 'requires' instead of 'seems to 
require' for Ruben's argument to be even formally valid.) It should be 
emphasized that the word ‘France’ is vague and ambiguous—a point explicitly 
granted by Ruben. It may be used in the sense of international law, in the sense 
of a historical nation, in the sense of a geographic region or in the sense of a 
society, and so on. I shall not here try to botanize the various meanings of 
‘France’, but I consider it to be important to keep the different meanings 


1 Ruben has pointed out to me a difficulty with his discussion related to premise (4). It is that he 
somewhat questionably treats names like ‘France’ as rigid designators and assumes they 
necessarily denote the same entity (e.g. set, depending on the case under scrutiny) in every 
possible world. But, as in effect argued by Gibbard [1975], one can perhaps succeed in treating 
such names as non-rigid designators possibly denoting different entities in different possible 
worlds. If this is acceptable, e.g. Ruben’s argument relying on counterfactuals on pp. 17-18 is not 
sound, for then, for instance, counterfactuals such as the one that Bismarck might have been a 
national of France can come out true (contrary to Ruben's claim in the book). 




distinct. Fortunately, though, my points below do not really depend on this 
matter. 

Even if we were to go with this commonsensical way of thought, we need not 
think (at least immediately) that there must be a holistic entity, France, serving 
to make (1) true. To assume so would be not only naive but also partly 
question-begging, as the purpose of the argument is just to show that there are 
such social entities, which furthermore are taken in the conclusion to be 
irreducible. Thus, we cannot assume at least without proper and detailed 
argumentation what (2) prima facie claims, viz. the existence of the holistic 
entity France. But Ruben does not tell us what kind of substance France really 
is supposed to be. What would a world have to be like if there were no such 
substance (over and above individualistically acceptable individuals and 
relationships) and what difference would the addition of the entity France then 
make? Ruben does not clarify these points. Nor does he present relevant 
arguments. This is a clear lacuna, because much of his further argumentation 
in the book relies on the mentioned shaky conclusion that there is a (holistic) 
social substance, France, which is supposed to be a candidate for individualis- 
tic reduction. (Ruben also relies on various commonsensical intuitions 
concerning the properties of such wholes, but this is dubious as there surely are 
conflicting intuitions about them.) If, however, we do not accept (2) in Ruben's 
sense, we are of course free to think up other possibilities not open to Ruben. 

I have still the further criticism against (1) that both the naive and the more 
sophisticated commonsensical approaches may fail after all. Accordingly, I 
would like to treat the matter of truth in terms of best-explaining theories (see 
Tuomela, 1985, for arguments for this position). Thus, my claim becomes that 
(1) can be true only if a holistic social theory is going to give the best 
explanation of social phenomena. Concentrating on a fixed sense of ‘France’, 
(1) (and (1’)) will simply turn out to be false if the best-explaining theory will 
not postulate holistic social substance such as France (cf. Stich's, 1983, 
arguments much along these lines). 

Premiss (2) is also very problematic, as already seen. In addition to the 
mentioned criticisms of it, it seems to me that Ruben can be charged with an 
uncritical acceptance of a rejectable form of the so-called linguistic Myth of the 
Given. According to it language is somehow logically tied to the world: if the 
holistic word ‘France’ is used it (unless reducible or eliminable) somehow has 
got to refer to a holistic entity, viz. France holistically understood. This is just 
the mentioned naive commonsensical idea about how to connect language 
with the world. My critical claim can also be supported by referring to Ruben's 
definition of an entity: ‘entities are what true statements or beliefs are about’. 
But the Myth of the Given ought to be rejected, as it leads to endless confusions 
and difficulties (see e.g. Tuomela, 1985, Chapter 3, for a discussion). And, if 
Ruben indeed is committed to this doctrine, we have here another reason for 
rejecting premiss (2). 




1.3 My only comment on Ruben's premiss (4) relates to what I take to be the 
most interesting attempt at reductive identification of those considered by 
Ruben. This attempt is psychologism of a kind. While I take this doctrine to 
contain an important germ of truth I think that, precisely because of its 
restriction to mental states only, it is not correct. Let us thus consider Ruben’s 
criticisms of the general proposal to identify France reductively with relevant 
people's having certain beliefs or attitudes. The individualist’s idea here is that 
the reality which underlies our belief in the existence of things like France 
is simply the psychological reality of human beings (mutually) thinking 
thoughts or having attitudes about something, and that things like France exist 
just in so far as these psychological states do. Ruben thinks that the relevant 
attitudes must be about France. He offers two arguments against the possibility 
of reductive identification of France with any group of persons who hold beliefs 
about France. 

The first argument presupposes that the beliefs in question are singular. And 
what is much more, Ruben takes them to be de re and thus to entail the 
existence of France. If so, the supposed reductive identification becomes 
circular, claims Ruben. This seems to be because in that case we are assumed 
to be dealing with beliefs of the form (Ex) ((x=France) & B((x=France)&...)), 
and if such beliefs occur in the analysans of France we obviously go in a circle. 
For the analysans not only explicitly states the existence of France but does it 
by using the very term ‘France’. But now the question arises whether that very 
term really is needed. Perhaps we could use a definite description such as ‘the 
land of de Gaulle’ instead. Then we could perhaps escape conceptual reference 
to France. But still France would persistently be there... 

Ruben also considers de dicto beliefs, general ones. They may be taken to 
have the form B((Ex)(x=France) & ...). But he tries to claim that they must in
effect entail the corresponding de re beliefs, for otherwise reductive identifica- 
tion of France would not be possible. This is because reductive identification 
presupposes that the reducendum exists: ‘a necessary condition for reductively
identifying France with any entity whatever is that France exists’ (p. 39). And
this is backed by step (2) in Ruben's argument. Perhaps we can say that this is
fine as far as it goes, even if it sounds a little like cheating. For, except for
some remarks, Ruben does not seriously consider the possibility that the 
putatively existing social entity can be eliminated. That is, it remains a 
possibility that while France exists relative to the framework of common sense, 
it does not really exist, viz. exist relative to the best-explaining theory about the 
social realm (relative to a disambiguated sense of ‘France’). And it is this latter 
possibility that I am most concerned with. Ruben's (1985) Chapter 1 is 
therefore somewhat disappointing as it deals so exclusively with only reductive 
identification and, in my view, too lightly dismisses elimination (due to step (2)
in the argument?).
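
The two belief forms discussed above can be set side by side in standard notation. This is only a sketch: it assumes that '(Ex)' in the text is the existential quantifier and that 'B' is the belief operator, and it keeps the ellipses exactly as they stand in the formulas quoted above.

```latex
\begin{align*}
\text{de re:}    &\quad \exists x\,\bigl(x = \text{France} \;\land\; B(x = \text{France} \land \dots)\bigr) \\
\text{de dicto:} &\quad B\bigl(\exists x\,(x = \text{France}) \land \dots\bigr)
\end{align*}
```

In the first form the quantifier stands outside the scope of B, so the analysans itself asserts that France exists, which is the source of the alleged circularity. In the second form the quantifier lies inside the scope of B, so the belief by itself carries no such existential commitment.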

There are at least the following two alternatives available to one who rejects 


Ruben and the Metaphysics of the Social World 265 


the Myth of the Given and the above approach taken by Ruben. The first one, 
mentioned above, is to analyse the truth of beliefs as a pragmatist does roughly 
in terms of what is semantically assertable in virtue of the rules of language or 
what is justified or warranted to believe in one's society. The other possibility is 
to define the truth-makers for people's beliefs in terms of the psycho- 
sociolinguistic language-world rules (embodying factual generalizations) that 
they in fact obey. And if the best-explaining ‘social psychological’ theory 
analyses this in terms of individualistic entities (rather than holistic ones), 
then (to the extent it will postulate anything like folk psychological beliefs at
all) the real entities that beliefs are going to be about will ultimately be
individualistically construed ones. For instance, putting the matter vaguely in 
a commonsensical way, we might understand talk about France as something 
having as its ‘truthmakers’ persons, pieces of land, buildings, cows, cases of 
Camembert, bottles of Chateau Lafitte, and what have you. Alternatively social
wholes might be construed, in the manner of Copp (1984), as aggregates 
consisting of mereological sums of temporal stages of persons (and perhaps 
other individualistically acceptable entities). In any case, my general conclu- 
sion about Ruben's above argument for the existence of irreducible social 
entities is that it is not a sound one. 


2 SOCIAL PROPERTIES AND THEIR BASIS 


2.1 In Chapter 3 of his book Ruben defends property holism, viz. the view 
that there are irreducible social properties. Thus, his basic question here is: Can 
social properties be reductively identified with mental properties (perhaps in 
combination with material properties)? Ruben claims on p. 87 that 'If an entity 
(a nation or group, for example) is irreducibly social, then no mental or 
material property is true of it'. Furthermore, he claims that if an entity is 
irreducibly mental, then no material property can be true of it. But there could 
be social properties true of minds. If an entity is irreducibly material, both 
social and mental properties can be true of it. These are interesting claims 
which seem not to be obviously true and which at any rate are worth a longer 
discussion than the brief remarks Ruben makes (cf. pp. 87-8). 

Ruben presents three arguments for the existence of irreducible social 
properties. The first goes as follows (p. 90): 


(O) There are irreducible social entities. But, every irreducible social entity 
must possess at least one irreducible social property. Therefore, there are 
irreducible social properties.


This argument connects entity holism and property holism, assuming the 
truth of the former as a premise. But, we have above found reason seriously to 
doubt, if not reject, entity holism. For this reason and because Ruben does
not himself want to place much weight on this argument, I shall not here 
discuss it further. 

Ruben’s second argument for property holism goes in terms of the argument 
from alternative realizations: 


(1) The argument from alternative realizations applies to social properties, for, 
due to their conventional nature, there are no limits as to how they can be 
realized in terms of mental and material properties. 

I shall not here spell out the argument from alternative realizations in full 
(see Ruben, p. 95). The basic idea is however that a social property can be 
instantiated in so many different ways that there is no disjunction of mental 
and material properties which is both necessary and sufficient for the social 
property in question. Ruben thinks that in the case of attempts to reduce 
mental properties to material ones the corresponding argument does not 
work, basically because it cannot be a priori known that no disjunction of
material properties is nomologically necessary and sufficient for a given 
mental property. But, in the case of social properties we can argue on a priori 
grounds that no such mental and material barriers for social properties can be 
found. Consider, for instance, the social property of being a mayor. Ruben's 
basic argument is this (p. 105): 


One might believe that which properties were nomologically sufficient for being a 
mayor was, in part, something conventional, something not wholly dictated to 
us by the material or mental structure of reality. Human beings have a part in 
choosing what the material and mental world has to be like, such that being that 
way makes it true that someone is a mayor. It isn’t just in virtue of the laws of 
nature that certain arrangements of mind and matter are sufficient for the 
realization of social properties. It's also in virtue of how we decide to construct the 
social world. 


I think that it is important to see that the social world is in an important
sense conventional. But the trouble with Ruben’s point is that mental
properties are also in part conventional; indeed, conventions (explicit or tacit
agreements) are something characterized by mental predicates, we can say. If
this is granted, conventionality does not deliver what Ruben thinks it does. As
a matter of fact, if both the mental and the social are conventional, that speaks
for their connectability rather than against it. Ruben surely can accept
connection in some loose sense, but not, e.g., in the sense of necessary and
sufficient connections (which now could not be nomological but conventional
and conceptual in nature). The notion of supervenience actually gives an
interesting possibility for connecting the social with the mental (and material). 
I have argued in Tuomela (1987) in the case of action predicates that 
supervenience in this case must be so strong as to deliver even a coextensive 
necessary and sufficient microbasis, viz. disjunction of micropredicates, i.e. 
action predicates applied to individual agents, for every macro-social action
predicate applied to a collective. I conclude that, although the matter surely is 
open for further discussion, Ruben's present argument for the applicability of 
the argument from alternative realizations, based on the conventional nature 
of social properties, does not seem to work. 

2.2 Ruben characterizes the notion of social property by reference to nested 
systems of belief: a property is a social property if and only if from the fact that P 
applies to anything, it follows that a nested system of beliefs and expectations 
exists (p. 118). Ruben's Gricean-type account is interesting and defensible, I 
think. I shall not here describe the somewhat complicated notion of a nested 
system of beliefs, involving e.g. social loop beliefs. 

One interesting thing to note about Ruben's notion of a social property is 
that it is a relational one: if any social property is true of anything, at least two
persons exist, for the notion of a nested system of beliefs requires that. And, 
note, too, that there can be social psychological relations which are not social: 
a person can envy or admire another one without there existing a nested 
system of beliefs of the kind required for a social property. 

Social properties can be variable or non-variable. A social property is
non-variable if and only if there is some specific system of beliefs and expectations
that must exist whenever the property is true of something (p. 119). For 
instance, the British custom of drinking tea at breakfast is a non-variable social 
property, Ruben claims. On the other hand the property of being a mayor is a 
variable social property, for ‘when someone is a mayor, there must be some 
nested system of beliefs and expectations widely shared'. 'But there are few, if 
any, restrictions on the range of beliefs a society could expect a mayor to do' 
(p. 120). 

Now we can go to Ruben's third argument for the existence of irreducible 
social properties (p. 121): 


(2) Even if available, analyses of social properties in terms of mental (and 
material) properties would not be reductive. 


In order to argue for this Ruben needs still another distinction (p. 123): 


I call a social property ‘weakly social’ when all of the beliefs or expectations in the
associated nested system have propositional objects of the form 'Someone will F',
where 'F' specifies a non-social action type. A social property is strongly social 
when some of the associated beliefs and expectations have propositional objects 
involving a social action type. 


Suppose, for instance, that being a mayor involves that it is believed that he 
will give executive orders. Ruben correctly points out that, while giving an 
executive order is a social property, expecting that a mayor will give executive 
orders is only a mental property. Belief contexts are nonextensional and from
the belief in question it does not follow that there are executive orders or even 
that there are mayors. 

But, Ruben continues on p. 124, 


although beliefs and expectations are nonextensional with regard to objects or 
token acts or events, they are not nonextensional with regard to properties or 
action types. If I have an intelligible belief that x is P, it does not follow that there 
is an x, but it does follow that there is a property P such that I believe that x has it. 
This is, of course, compatible with believing that properties are only sets, or
concepts, or anything else one might like. 


I shall here accept the (not unproblematic) claim that the existence of a 
(possibly uninstantiated) property P follows. But it also seems to me that by 
saying that P could be e.g. a concept he has watered down his idea. For many 
forms of individualism are compatible with the existence of irreducible social 
concepts. Ontologically Ruben's liberal statement is compatible with social 
properties being in a person's thoughts, heads, linguistic and other actions— 
depending on how concepts are understood ontologically. But one wonders 
what reasons there could be for Ruben to water down his holism in this way. I 
think, on the contrary, that Ruben's overall position would better have fitted a 
stronger view of what properties are (e.g. some form of Platonism or Popperian
third-worldianism). But if Ruben does not want to take a stand here, that is all
right, of course.²

Even if a commitment to social properties in the above sense is accepted to 
follow, it does not yet follow that such social properties need to be irreducible— 
which is what Ruben needs. For it might happen that the social property could 
be given an analysis in terms of non-social properties. And, it can even be 
allowed that there be a long chain of nested analyses such that in the end only 
non-social properties are needed. So, what a theorist arguing for the existence 
of irreducible social properties needs is that there be what I would like to call
‘strongly strong’ social properties satisfying the following condition by Ruben 
(p. 125): 


(*) There are at least some strongly social properties such that no analysis of 
them, however remote, in terms of beliefs and expectations about the 
instantiation of only weakly social properties and non-social properties can be 
adequate. 

To show that his thesis (*) is correct Ruben has to find at least one such 
strongly social property, and claims that in the mayor-example (*) will indeed 
be satisfied (p. 120). But, there is very little discussion and argumentation to
substantiate this, and, although the example is suggestive, an individualist can 
well remain unconvinced. The matter is, unfortunately, to a great extent left 
hanging in the air. (Note, by the way, that a liberal individualist can surely
survive the truth of (*); cf. below.) My conclusion concerning Ruben's three
arguments for showing the existence of irreducible social properties is that
none of them is watertight.


2 Ruben has informed me in discussion that his intention in the book is not to defend ontological
realism and that he is willing to grant that his views there be compatible with a nominalist
reading. Given this, the present paragraph cannot really be taken as a criticism of Ruben's
views, although it should be pointed out that Ruben nowhere says that in the book.


3 INDIVIDUALISM AND EXPLANATION 


3.1 In the last chapter of his book Ruben discusses individualism as a thesis 
about explanation. He calls this doctrine methodological individualism. (Note 
that ‘methodological individualism’ refers to a mixed bag of views and the 
explanatory doctrine does not exhaust it.) In any case Ruben means by his 
methodological individualism roughly the view that ‘ultimately, everything 
that happens or occurs can be explained without recourse to social entities or
social properties’ (p. 131). 

I shall not explicate Ruben’s technical notions in great detail, but we need a 
few characterizations to be able to understand the criticisms that follow. 
Ruben takes the relata of the explanatory relation to be facts, and he is mainly 
concerned with Hempelian nomological (especially deductive-nomological) 
explanation. He does not, however, commit himself to that view. As my 
criticisms will not really depend on any particular view of explanation, I shall 
not discuss the matter in this note. We need to say a little about facts, however. 
Ruben takes a fact to be a particular fact if it has or entails something with this 
form: object o or token event e is P; and it is a social fact if either P is a social 
property or the object o or the event e is a social object or social event token. A 
fact is general if it has or entails something with this form: there is some object 
or token event of type or kind P; and it is a general social fact if P is a social
property. A fact is a universal fact if it has or entails something of this form: any
object or token event which is P is Q; and it is a universal social fact if either P or
Q or both is a social property. Now we can talk about facts as both explananda
and explanantia, and take the relation between them to be e.g. a deductive 
explanatory relation. 

Ruben's first attempt to specify methodological individualism (MI) goes in 
terms of explanatory chains which have the property that, by explaining one's 
explanantia further, one should ultimately be able to find non-social or social 
fact-free explanations. MI is accordingly taken to entail this (with e 
representing an explanation, p its explanans and q its explanandum): 


(1) Either (a) Both p and q are about non-social facts. 
Or (b) q is about a social fact, but p is about a non-social fact. 


Or (c) p is about a social fact (whether or not q is about a social fact), and there
is another explanation e’ such that q is the explanandum in e’ and there is 
another explanans of q in e', r, and r is about a non-social fact. 


Or (d) p is about a social fact (whether or not q is about a social fact), and there
is another explanation e’ such that p is the explanandum in e’ and there is an 
explanans of p in e’, r, and r is about a non-social fact. 


Or (e) p is about a social fact (whether or not q is about a social fact), and there is
an explanatory chain, in which the facts mentioned by p and q figure, such 
that some explanatory ancestor of the fact mentioned by p is a non-social fact. 


(1) says that, in any explanation, either the explanans-entity is a non-social
fact, or, in case it is a social fact, either it is replaceable by a non-social fact, or it 
is itself explainable by means of a non-social fact, or it is an explanatory- 
descendant of a non-social fact. So (1) insures that by going backwards in 
every explanatory chain, one will find some point at which the explanatory 
chain has become free of any social facts. But (1) nevertheless allows for the 
following type of chain, where s means social and n non-social fact and where 
the arrow indicates the explanatory relation: 


... s4 → n2 → s3 → n1 → s2 → s1.


This means that (1) does not guarantee that once we have found a non-social 
explanans no social explanantia will ever pop up when going leftwards in the 
chain. The individualist wants proper asymmetry. So Ruben proposes that the 
following two conditions be accepted: 


(2) If an explanatory chain is finitely long, it is sufficient, to insure the 
explanatory priority of the non-social, that the first member of the chain be a 
non-social fact. 


(3) If an explanatory chain is infinitely long backwards, then it is sufficient, to
insure the explanatory priority of the non-social, that there be some point in 
the chain with only non-social facts and such that no facts in the chain prior to 
that point (there will be an infinite number of them), are social facts. 


Now (2) and (3) can be summed up by: 


(4) For every explanatory chain, there is some point in the chain at which 
there are only non-social facts and such that either there are no facts prior to 
that point at all, or, if there are, no facts prior to that point are social facts.


Condition (4) will now replace (1), and it seems to be an adequate condition for 
an explanatory methodological individualist to accept.
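
The difference between (1) and (4) can be made concrete with a small sketch. It rests on several simplifying assumptions that are not Ruben's own: a chain is a sequence of labels 's' (social fact) and 'n' (non-social fact) ordered from earliest to latest; an infinite backward chain is modelled as a block (`cycle`, a hypothetical parameter) that repeats without end before a finite `tail`; and (1) is read loosely as 'every social fact has some non-social explanatory ancestor'. The function names are illustrative only.

```python
from typing import Optional, Sequence

SOCIAL = "s"
NON_SOCIAL = "n"


def satisfies_weak_condition(tail: Sequence[str],
                             cycle: Optional[Sequence[str]] = None) -> bool:
    """Loose reading of (1): every social fact has a non-social ancestor."""
    if cycle:
        # Infinite past: a non-social fact anywhere in the repeating block
        # recurs arbitrarily far back, so it precedes every fact in the chain.
        return NON_SOCIAL in cycle
    # Finite chain: no social fact may occur before the first non-social one.
    seen_non_social = False
    for fact in tail:
        if fact == SOCIAL and not seen_non_social:
            return False
        if fact == NON_SOCIAL:
            seen_non_social = True
    return True


def satisfies_condition_4(tail: Sequence[str],
                          cycle: Optional[Sequence[str]] = None) -> bool:
    """Condition (4): some point is non-social and nothing prior to it is social."""
    if cycle:
        # Every fact has the whole repeating block before it, so such a point
        # exists only if the block contains no social fact at all.
        return all(fact == NON_SOCIAL for fact in cycle)
    # Finite chain: this reduces to condition (2), a non-social first member.
    return len(tail) > 0 and tail[0] == NON_SOCIAL


# The alternating chain from the text: ... s4, n2, s3, n1, s2, s1.
alternating_tail = ["s", "n", "s", "n", "s", "s"]
alternating_past = ["s", "n"]  # assumed shape of the part abbreviated by "..."

print(satisfies_weak_condition(alternating_tail, alternating_past))  # True
print(satisfies_condition_4(alternating_tail, alternating_past))     # False
```

On this modelling the two conditions agree for finite chains and come apart only for chains that run infinitely backwards with social facts recurring, which is the case the asymmetry requirement is meant to exclude.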

Condition (4) can be further explicated to give more specific theses by 
interpreting 'non-social' in different ways. Ruben considers the following
possibilities for what non-social facts could be: 


(e) psychological facts about the beliefs, desires, etc., of agents; 
(f) psychological facts about agents and material facts about the world;
(g) (only) material facts about the world. 


Ruben calls (e) and (f) psychological versions of MI. He considers (f) in more 
detail and claims to be able to refute it. Let us examine the issue. 

A methodological individualist can accept that explanatory psychological 
states can sometimes be socially conditioned and socially explainable. So what 
is it that the methodological individualist asserts and his opponent denies? 
Ruben plausibly suggests the following specification of (4) to the psychological 
case: 


(4') for every explanatory chain which includes at least one social fact, there is
some point in the chain at which the facts are only psychological (and 
material) facts, and such that either there are no facts prior to that point at all, 
or, if there are, no facts prior to that point are social facts. 


Ruben regards (4') as utterly false. His argument builds on the supposed fact
that people do have rationally held true beliefs about social matters. If a person 
rationally holds the true belief that a society is matrilineal, then the fact that 
that society is matrilineal can be taken to explain that belief, Ruben assumes. 
The fact that the society is matrilineal explains the person's belief that it is 
matrilineal and not the other way around—as Ruben thinks the methodologi- 
cal individualist must have it. But then (4’) becomes falsified if at any point in 
the explanatory chain people can have (and perhaps must be taken to have) 
such beliefs. Is Ruben right? 
Let us go into some more detail. Ruben claims (on p. 168) that 


the methodological individualist will hold: that the psychological states required 
by (4') will have explanations; that cognitive states like belief will be included
among them; and that the content of these cognitive states will be describable 
only by using descriptions of social properties or social objects or both. 


He then claims that subject to these three assumptions (4’) commits the 
methodological individualist to: 


(5) The beliefs required by (4’) are either untrue, or not rationally held. 


But Ruben's claim about the implication (or commitment) does not hold, at
least on his premises. (4') can be consistently joined with the negation of (5), I
claim, and so Ruben cannot at any rate get a commitment based on
entailment. Let us note that (5) becomes false if the methodological
individualist uses rationally held true beliefs in his explanantia, for then both
disjuncts in (5) become false. And I claim that that is possible. The possibility of
negating (5) simply cannot be used against the methodological individualist in 
this discussion without additional assumptions. 

Ruben focuses his discussion on showing that both the disjuncts of (5) are 
false and that therefore the whole thesis is false (and hence unacceptable). The 
first disjunct is argued by him to be false as follows. He assumes that the 
explanatory beliefs (or other cognitive states) are to be explained by their very
contents (construed as social states of affairs). Now we get the result that the 
beliefs (cognitive states) in question must be true, for false explanantia do not 
give proper explanations. I will here accept this with the remark that this does 
not yet require the social beliefs be explained by the social entities they are 
assumed by the holist to be about. But, furthermore, he claims that people 
cannot in the long run continue to operate with non-rationally held beliefs. He 
does not give a proper argument for this, but I shall accept for our present 
purposes that the best-explaining beliefs in (4’) must in the long run be not 
only true but rationally held. So (5) indeed seems false. But (4') does not 
thereby become falsified, I claim. Why not? One answer is simply and briefly 
that a social belief can be rationally held on the basis of a set of individualistic 
beliefs. For instance, we can surely rationally hold the true belief that a 
corporation made a business deal on the basis of some individualistically 
acceptable beliefs about certain persons' doing something relevant in the right
circumstances. And we can rationally hold beliefs of this kind independently of
whether social matters are somehow reducible to individual matters— 
individual matters can surely be good evidence for the social. 

To illustrate the above matter further, consider the following parallel 
explanations taken up by Ruben. We supposedly want to accept both of the 
following, analogous theses: 


(6) In normal cases, part of the explanation of why persons rationally hold the 
true belief that they are sensing something round is the fact that there is 
something round nearby. 


(7) In normal cases, part of the explanation of why persons rationally hold the 
true belief that some societies are matrilineal is the fact that some societies are 
matrilineal. 


But Ruben claims that the methodological individualist cannot accept (7). The
contradiction is supposed to be just the mentioned one that (7) commits us to 
the use of social explanantia (contents of beliefs) as more fundamental than the 
beliefs in question. But, the methodological individualist can say the following 
two additional things here (over and above the above point that individualisti- 
cally acceptable beliefs can make social beliefs rationally held). First, if he is 
also—as is typical—an ontological individualist he can say that the analogy is 
not a good one, for round things really exist while societies do not. The second 
response is that irrespective of whether societies and their properties really 
exist they can be individualistically described. And such individualistic 
descriptions may employ social predicates—as long as those predicates are 
interpreted as concepts or something else acceptable to the individualist (see 
my remarks at the end of Section 2 above). One can be a methodological (or
explanatory) individualist if Ruben's discussed criterion (*) (on p. 125 of the
book) comes out as false. But, one can also be a methodological individualist in
a weaker sense even if (*) happened to be true. That is a sense in which
individuals, their states and social interrelationships are used for best
explanations in social science but nevertheless no supraindividual social
wholes are admitted in one's best explanantia.³

Ruben also discusses materialistic individualism in passing but I shall not
here go into that. My conclusion about Ruben's criticisms of methodological 
individualism is that he has not presented strong criticisms against it—except 
possibly against some of its simplistic versions. Nor has he presented very 
successful arguments against all versions of ontological individualism, not 
even against those he explicitly considered in the book. But he has succeeded in 
formulating many central questions of dispute in a sharp way and he has also 
presented interesting arguments worthy of further elaboration and discussion. 
In spite of my above criticisms, I would like to emphasize that Ruben's book is 
without doubt an important contribution to the philosophy of the social 
sciences.*


RAIMO TUOMELA 
Department of Philosophy 
University of Helsinki 
Helsinki, Finland 


3 In correspondence Ruben has pointed out that the existence of irreducible properties is 
compatible with the nonexistence of social objects. Accepting this and the point of note (2), 
Ruben's general position becomes compatible with liberal individualism in my above sense, 
while my first critical point about his account of explanation still counts as a criticism. 

*I wish to thank Dr. Ruben for sharp and informative comments on this critical note. His 
cooperative criticisms and suggestions helped to clarify several points in the paper.


REFERENCES 


COPP, D. [1984]: ‘What Collectives Are: Agency, Individualism and Legal Theory’,
Dialogue XXIII, pp. 249-69.

GIBBARD, A. [1975]: ‘Contingent Identity’, Journal of Philosophical Logic, 4, pp. 187-221.

RUBEN, D.-H. [1985]: The Metaphysics of the Social World, RKP, London.

STICH, S. [1983]: From Folk Psychology to Cognitive Science, MIT Press, Cambridge.

TUOMELA, R. [1985]: Science, Action, and Reality, Reidel, Dordrecht.

TUOMELA, R. [1987]: 'Collective Action, Supervenience, and Constitution', paper 
submitted to Synthese. 


Brit. J. Phil. Sci. 40 (1989), 275-285 Printed in Great Britain 


DISCUSSION 


Zande Logic and Western Logic 


ABSTRACT 


In this paper I discuss logic from a naturalist point of view, characterizing it as those 
shared patterns of thought which are socially selected from among the various patterns 
of thought to which we are naturally inclined. Drawing on Evans-Pritchard’s 
anthropology, I discuss a particular example of Zande thought. I argue that Evans- 
Pritchard’s and Timm Triplett's analyses of this example make the mistake of applying 
Western logic to Zande beliefs and thus find a contradiction. I argue that from the 
naturalistic point of view, Zande logic is different from Western logic and that there is no 
contradiction in Zande thought. 


1 Introduction 
2 Logic as Psychology and Logic as Institutions 
3 The Example of Zande Thought 
3.1 The Apparent Contradiction 
3.2 Triplett's Analysis 
3.3 Evans-Pritchard's Analysis 
4 The Naturalistic Analysis of Logic 


1 INTRODUCTION


In 1937 E. E. Evans-Pritchard published Witchcraft, Oracles and Magic Among 
the Azande. This work is one of the standard sources in the philosophical 
discussion of rationality and relativism. In particular it provides one of the 
three examples David Bloor uses to illustrate the negotiated character of logical 
principles.¹ Recently in this journal Timm Triplett attacked Bloor's use of this
particular example and concluded that: 


There is nothing here that is illogical about the Azande's reasoning. It is, in fact,
quite Aristotelian ... Far from vindicating Bloor, the example of the Azande
gives support to the idea of a universal logic that is, even if not articulated or
explicitly studied, adhered to in all cultures in their practical reasoning. ([1988],
pp. 364-5)


1 This example appears in chapter 7 of Bloor's [1976]: ‘Negotiation in Logical and Mathematical
Thought’. Because of the currency that this example has gained in discussions of rationality, it is
probably worth making the point that ‘Zande’ is the singular noun and adjectival form of the
word and that 'Azande' is the plural noun.


276 Richard C. Jennings


The particular example that is at issue concerns the Zande concept of 
witchcraft and the substance which the Azande suppose to be associated with 
witchcraft. Bloor argues that on the basis of several of their beliefs, the Azande 
should, according to Western logic, draw a certain conclusion. But, he argues, 
they do not, and therefore concludes that they have a different logic. Triplett 
replies that the Azande ‘are simply reasoning according to the method of 
reductio ad absurdum' ([1988], p. 364). Since they find the conclusion 
unacceptable, he claims, they think there must be something wrong with, and 
therefore reject, one or more of the premises. Thus, Triplett concludes that the 
Azande use the same logic as we do. 

My purpose in this discussion is three-fold. First, I will discuss Bloor's 
analysis of Zande thought in the light of his distinction between logic as it 
relates to the psychology of reasoning and logic as it relates to the institutional 
framework of thought. My aim in this discussion is to clear up Triplett’s 
confusion over a distinction that is basic to Bloor's analysis. Then, secondly, I 
will look in detail at the actual example that is under discussion and I will show 
how Triplett misses the force of Bloor's argument because he has failed to give 
an accurate account of Zande thought as it was reported by Evans-Pritchard. 
Finally, I will discuss Bloor's naturalistic analysis of contradiction and explain 
why, in these terms, there is no contradiction in the Azande's beliefs. 


2 LOGIC AS PSYCHOLOGY AND LOGIC AS INSTITUTIONS 


Turning first to the distinction between logic as psychology and logic as 
institution, we find Triplett confessing his confusion over the following remark 
that Bloor makes about the nature of Zande logic: 


[T]he Azande have the same psychology as us but radically different institutions. 
If we relate logic to the psychology of reasoning we shall be inclined to say that 
they have the same logic; if we relate logic more closely to the institutional
framework of thought then we shall incline to the view that the two cultures 
have different logics. ([1976], pp. 129-30) 


Triplett finds it ‘unclear what it is to “relate logic to the psychology of
reasoning” or to “the institutional framework of thought”’ ([1988], p. 365).
But this distinction is crucial to Bloor's analysis and needs therefore to be 
clarified. 

The basic consideration that needs to be taken into account is that Bloor's 
analysis of knowledge is a naturalistic analysis. For Bloor, what counts as 
knowledge is what is collectively agreed upon: whether it is moral truths,
scientific truths or the truths of mathematics and logic, knowledge is a social
phenomenon, it is a part of culture. And what different groups accept as
knowledge is to be explained in naturalistic terms, in terms of the psychological 
and sociological nature of people. Whatever kind of knowledge it is, and 
whether we accept it or not, its acceptance is to be studied as a natural 
phenomenon. One implicit assumption that Bloor makes is that people are 
born with more or less similar psychological makeups. But given that different 
groups of individuals, starting with the same psychological makeup, wind up 
with such varied systems of belief, such different bodies of knowledge, the 
question arises: how does this happen? And it is the answer to this question
which concerns sociologists of knowledge. 

In saying that the Azande have the same psychology as us, Bloor is saying
that they are born with the same psychological potentials as we are—their 
minds, initially, work the same way as ours initially did. What is different is 
that they grew up in the midst of Zande culture and institutions whereas we 
grew up amidst our Western culture and institutions. And it is this culture and 
these institutions that shape our thinking and behavior, that make us Zande or 
Western. When Bloor speaks of relating logic to psychology he means relating 
it to our natural proclivities of thought. Following Barry Barnes we can call 
this natural tendency of thought natural rationality. Natural rationality is 
intended to include all the lines of thought our minds are naturally inclined to 
move along. But as anyone who has taught introductory logic, or elementary
mathematics, knows, not all natural proclivities of thought are equally 
acceptable. Initially there are many directions in which our thought naturally 
proceeds, but not all of these directions of thought can be maintained at the 
same time because they lead in conflicting directions. Some can be maintained, 
but only at the expense of others. Our natural rationality must, in some way, 
be tailored to suit our needs. 

Bloor’s position is that it is the social framework into which we are born that 
provides the suitable structure, that we learn through the process of 
culturation which lines of thought are acceptable and which are not. It is this 
latter aspect of thinking that Bloor refers to when he speaks of relating logic to 
the institutional framework of thought. And it is to the former, to natural 
rationality, that he refers when he speaks of relating logic to the psychology of 
reasoning. Now it should be clear what it means to say that both 
psychological and institutional factors are involved in reasoning. In the first 
instance we are born with various and diverse proclivities to infer; but these 
are then regimented and stabilized through the process of education and 
culturation. Immediately following the passage which Triplett found confusing
and unclear, Bloor explains this distinction as follows:


Our natural proclivities to infer, like our natural proclivities in all other
directions, do not in themselves form an ordered and stable system. Some
impersonal structure is needed to draw boundaries and to allocate each tendency
to a sphere deemed proper to it. Because there is no natural state of equilibrium
one line of inference will as surely come into conflict with another as one appetite
or desire will with another. ([1976], p. 130)


2 Thus we can see that when Triplett speaks of ‘society's structures, as well as the individual
psychologies instilled in its members’ ([1988], p. 365) he is using ‘psychology’ in a different
sense from that in which Bloor uses it. For Bloor, our psychology is not instilled in us, it is
something with which we begin life.


Surely this remark goes a long way toward clarifying the distinction I have just 
explained. 

Insofar as people do have the same psychology—insofar as they have the 
same natural proclivities of thought—we can, with Triplett, find support for 
‘the idea of a universal logic that is, even if not articulated or explicitly studied,
adhered to in all cultures in their practical reasoning’ ([1988], p. 365). But this
would not be the articulated Aristotelian (or propositional, or quantificational)
logic of the West but, rather, the natural rationality, or proclivity to thought, 
that we, as people, share. 

How this natural rationality is regimented and, in some cases, articulated or 
codified, is dependent on the cultural institutions in which the individual’s
education takes place. The fact is, of course, that not all societies codify their 
logic—Evans-Pritchard tells us nothing about Zande reflections on what, in 
general, do or don’t count as good reasons for various claims. Thus, for the 
Azande, all we can do is look at their shared practices in order to discover the 
institutionalized patterns of their thought. This will tell us about the normative 
constraints on their thought, i.e., their logic. This is a methodological point on 
which Bloor and Triplett apparently agree.³ The difference between Bloor and
Triplett is over whether the Azande share our institutionalized patterns of 
thought, whether (in this sense) they practice the same logic as we do. And it is 
to this issue that I now turn. 


3 THE EXAMPLE OF ZANDE THOUGHT

3.1 The Apparent Contradiction


The example of Zande thought with which this discussion is concerned can be 
constructed from various Zande claims which Evans-Pritchard presents in his 
book. He there quotes C. R. Lagae as saying that:


If a man has witchcraft-substance in his belly and begets a male child, this 
child also has witchcraft because his father was a witch. . . . Thus witchcraft does 
not trouble a person born free from it by entering into him. ([1937], p. 23) 


3 When we justify arguments we normally refer to rules of logic: articulated statements of what
linguistic moves are allowed or not allowed. The Azande have no such articulated set of
statements, they have no ‘rules of logic’. In this sense they do not have a logic at all. But if the
argument is over whether the Azande have the same logic as we do, as it is between Triplett and
Bloor, then we cannot be talking of such articulated ‘rules of logic’. Thus, the issue must be over
how the Azande think, over the description of what they actually do. And certainly this is what 
Triplett is talking about. In claiming that the Azande reject a premise of an argument he is not 
talking about explicit articulated rules to which they appeal but, rather, about what they 
actually do. 



If this is the case, then all male descendants of a male witch would be witches, 
as would be his father, his father’s father, etc. and their male descendants as 
well. Evans-Pritchard notes that: 


To our minds it appears evident that if a man is proven a witch the whole of his 
clan are ipso facto witches, since the Zande clan is a group of persons related 
biologically to one another through the male line. ([1937], p. 24) 


Evans-Pritchard says of this argument that: 


Azande see the sense of this argument but they do not accept its conclusions, and 
it would involve the whole notion of witchcraft in contradiction were they to do 
so. ([1937], p. 24) 


The contradiction would result from the fact that the Azande do not believe of
any clan that everyone in it is a witch. They do feel of some clans that they tend
more to witchcraft, but they do not conclude from this that all members even of 
these clans are witches: 


[C]ertain clans, especially the Abukunde and the Avundua clans, had a 
reputation for witchcraft in the reign of King Gbudwe. In Gangura’s province this 
reputation clung to the Abóka and Abanzuma clans. No one thinks any worse of 
a man if he is a member of one of these clans. ([1937], p. 25) 


Thus it appears that there is a contradiction in the Zande way of thinking. 
Their beliefs about witchcraft substance and its heritability lead to the
conclusion that every man in a male witch's clan is a witch, but they do not 
accept this conclusion. 

We can give the argument in more formal terms as follows:⁴


(1’) All and only witches have witchcraft substance. 

(2) Witchcraft-substance is always inherited by the same-sexed children of 
a witch. 

(3) The Zande clan is a group of persons related biologically to one another 
through the male line. 

(4) Man A of clan C is a witch. 


(5) Every man in clan C is a witch.


The Zande system of beliefs includes all of these premises, but the Azande do 
not accept the conclusion. 
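
That (5) does follow from the premises can be checked mechanically on a toy case. The following is a minimal sketch in Python, not part of the original discussion; the six-man clan and all of its names are hypothetical. It assumes, with Lagae's remark quoted above, that witchcraft-substance is acquired only by inheritance, so that the imputation runs upward to a witch's father as well as downward to his sons.

```python
# Toy model of the argument (1')-(5). The clan below is hypothetical.

# Premise (3): a clan is a group of men related through the male line.
# Each entry maps a man to his father (None for the founding ancestor).
FATHER = {
    "founder": None,
    "A": "founder",
    "B": "founder",
    "A1": "A",
    "B1": "B",
    "B2": "B",
}


def sons(man):
    """Return the sons of `man` within the toy clan."""
    return [m for m, f in FATHER.items() if f == man]


def witches_entailed(known_witch):
    """Close the set {known_witch} under the premises.

    Downward step, premise (2): every son of a witch inherits the substance.
    Upward step (assumption taken from Lagae's remark): the substance is only
    ever inherited, so a carrier's father must also have been a carrier.
    Premise (1') identifies carrying the substance with being a witch.
    """
    carriers = {known_witch}  # premise (4) together with (1')
    frontier = [known_witch]
    while frontier:
        man = frontier.pop()
        relatives = sons(man)
        if FATHER[man] is not None:
            relatives.append(FATHER[man])
        for relative in relatives:
            if relative not in carriers:
                carriers.add(relative)
                frontier.append(relative)
    return carriers


if __name__ == "__main__":
    # Conclusion (5): every man in the clan is a witch.
    print(witches_entailed("A") == set(FATHER))
```

The sketch shows only that the Western elaboration of the premises is valid. It says nothing about whether the Azande treat that elaboration as binding, which is the point at issue.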


4 What follows is a near replica of Triplett’s own second version of the argument. The main
difference is that I stick closer to Evans-Pritchard’s own account of the clan as being related
through the male line. I have not considered Triplett's first version of the argument, using the
premise ‘(1) Every witch has witchcraft substance’, because that premise was a result of his
confusion about the relation between being a witch and having witchcraft substance. Bloor 
makes this relation quite clear and, were it not clear from reading Bloor, it would be clear from 
reading Evans-Pritchard. 



I am concerned with three analyses which are given for this apparent 
inconsistency, or contradiction. The first analysis is Triplett's—he maintains 
that there is a contradiction, that the Azande see it, and that in response to the 
contradiction the Azande reject one of the premises of the argument. He 
categorically states that: 


They revise their beliefs so that they are consistent. ([1988], p. 364)


He then uses this analysis of what the Azande do to support his claim that they 
have the same logic as we do. The second analysis is that of Evans-Pritchard— 
according to him there is a contradiction, but the Azande do not see it. His 
effort is put into explaining how it is that the Azande avoid seeing the 
contradiction. And the third analysis is Bloor's. Bloor maintains that since the 
Azande do not revise their beliefs even after having this apparent contradiction 
pointed out to them, there is not, for them, a contradiction; and therefore they 
have a different logic to ours. 

In the remainder of this section I discuss Triplett's position and show how it 
can be strengthened on the basis of further information that Evans-Pritchard 
gives us about the Azande, but which Bloor does not report. Then I show on 
the basis of further information from Evans-Pritchard how this even stronger 
argument does not succeed. I show that Triplett is simply wrong about how the 
Azande respond to the argument formalized above. In the final section I discuss 
Bloor's position. 


3.2 Triplett's Analysis

Triplett argues that the Azande reason as follows:


Since the conclusion that all the members of the clan are witches is unacceptable, 
there must be something wrong with one or more of the premises. This is in fact 
precisely how the Azande respond according to Bloor's account: They reply that 
sometimes the witchcraft-substance is ‘cool’ and does not lead to its possessor
being a witch, i.e., they deny premise 1’. ([1988], p. 364)


By this account, the Azande respond to a conclusion that they are unwilling to 
accept by rejecting one of the premises of the argument, that is, they apply the 
method of reductio ad absurdum to the argument and, in this case, reject premise 
(1’). But of course there are other premises which could be questioned. And
indeed Evans-Pritchard reports that other premises are questioned. He tells us 
that: 


In practice [the Azande] regard only close paternal kinsmen of a known witch as 
witches. It is only in theory that they extend the imputation to all a witch's 
clansmen. ([1937], p. 24) 


That is, in practice, insofar as they are willing to act on such a conclusion as 
(5), they are only willing to extend the conclusion to the close paternal 
kinsmen of witch A, i.e., they deny the universality of premise (2). Moreover, in
any particular case they have the option of questioning premise (4), and they 
do this. According to Evans-Pritchard: 


They admit that the man is a witch but deny that he is a member of their clan. 
They say he was a bastard, for among Azande a man is always of the clan of his 
genitor and not of his pater... . [T]hey may support this contention by quoting 
cases where members of their kin have been shown by autopsy to have been free 
from witchcraft. It is unlikely that other people will accept this plea, but they are 
not asked either to accept it or reject it. ([1937], pp. 24-5) 


Triplett claimed that:


The line the Azande take is that the witchcraft-substance does not invariably 
confer ‘witchhood’ on the possessor of that substance. ([1988], p. 363)


We now find that there are two additional ways in which the Azande could be 
seen as avoiding the conclusion by the reductio method. 


3.3 Evans-Pritchard’s Analysis 


But even this stronger version of Triplett’s argument is short-lived, because, as 
Evans-Pritchard tells us, the Azande simply do not see a contradiction. He 
states that: 


Azande do not perceive the contradiction as we perceive it because they have 
no theoretical interest in the subject. ... ([1937], p. 25) 


Triplett proposes the following thought experiment: 


Suppose now that a Western anthropologist enters the scene and draws 
attention to the fact that their premises seem to implicate all of a clan C as 
witches. Does the fact that the Azande resist this conclusion show that they have 
an alternative logic, or that they reject logic as we see it? Not at all. Since the 
conclusion that all the members of the clan are witches is unacceptable, there 
must be something wrong with one or more of the premises. ([1988], p. 364)


Now we do not have to suppose that a Western anthropologist enters the scene 
and draws attention to such possible conclusions, because Evans-Pritchard 
was a Western anthropologist, did enter the scene and did draw attention to the 
consequences of the Zande premises. What he reports is that the Azande see 
the sense of this argument but they do not accept the conclusions; he tells us
that the Azande do not perceive the contradiction as we perceive it. He does not 
tell us that the Azande revise their beliefs so that they are consistent. Rather, 
his concern is with how they manage to avoid seeing the contradiction. Clearly, 
if the Azande gave up any of their premises, if they used the reductio argument 
as Triplett suggests they do, then this would not be a problem for Evans- 
Pritchard. Evans-Pritchard sees a contradiction but also recognizes that the 
Azande do not see a contradiction, that the Azande just accept the premises and 
do not accept the conclusion. He is concerned to account for the fact that the
Azande live with a logical inconsistency, he is at pains to explain how the 
Azande have institutionalized a logical mistake. 

It is certainly true according to Evans-Pritchard that the Azande will 
respond to intimations of such undesirable consequences of their beliefs in 
ways that support Triplett's position. But these are individual responses, not 
socially instituted responses—they do not constitute Zande logic but are, 
rather, natural responses of individuals—they are manifestations of natural 
rationality. I concede that in these cases something like a reductio argument is 
in the minds of those who make the reply. But these replies are not 
institutionalized, and they do not result in a revision of belief—they do not 
represent features of established Zande thought, of Zande logic. They are, 
rather, temporary elaborations of belief offered as an alternative to a different 
but unacceptable Western elaboration of Zande belief. If any of the temporary 
elaborations were generally accepted, and the appropriate revisions of belief 
were instituted, then that would constitute a change in Zande logic. And then 
Zande logic would, in this respect at least, approximate more closely to our 
own Western logic. But from what Evans-Pritchard tells us it is clear that the 
Azande do not so revise their beliefs and so, in this respect, their logic differs 
from ours. It is for this reason that he is concerned to explain how the Azande 
can live with a logical inconsistency. 

The way in which he accounts for the fact that the Azande live with a logical 
inconsistency is by distinguishing what they believe and think in practice from 
the theoretical implications of their beliefs. This distinction appears in his 
observation that: 


In practice they regard only close paternal kinsmen of a known witch as witches. 
It is only in theory that they extend the imputation to all a witch's clansmen. 
([1937], p. 24) 


And appears again when he says that: 


[W]e might reason that if a man be found by post-mortem immune from 
witchcraft-substance all his clan must also be immune, but Azande do not act as 
though they were of this opinion. ([1937], p. 24)


The Azande have various beliefs which, for us, have certain consequences.
But, for the Azande, they do not have these consequences. Evans-Pritchard 
accounts for this by distinguishing their practice, or actions, from the 
theoretical implications of their beliefs. He then argues that the Azande have 
no interest in the theoretical implications of their beliefs—that their concern is 
only with particular acts and circumstances, not with general conditions of 
individuals or clans. He states that: 


Azande do not perceive the contradiction as we perceive it because they have 
no theoretical interest in the subject, and those situations in which they express 
their beliefs in witchcraft do not force the problem upon them. . . . One attempts
to discover whether a man is bewitching some one in particular circumstances
and not whether he is born a witch. ... A Zande is interested in witchcraft only
as an agent on definite occasions and in relation to his own interests, and not as a
permanent condition of individuals.... Azande are interested solely in the
dynamics of witchcraft in particular situations. ([1937], pp. 25-6)


It is evident from Evans-Pritchard's own report of Zande thought that they do 
not revise their beliefs so that they are consistent in the way that Triplett claims 
they do. They do not revise their beliefs because they do not perceive a 
contradiction, and they do not perceive a contradiction because they have no 
theoretical interest in the subject. 

This is the account that Evans-Pritchard gives and it is accurately reported 
by David Bloor. Triplett has failed to give an accurate account of Zande 
thought as reported by Evans-Pritchard. As a result of this he misses the force 
of Bloor's argument. On the basis of an accurate report we see that Triplett is 
simply wrong in what he says the Azande do. The Azande do not revise their 
beliefs so that they are consistent (consistent in the Aristotelian sense). 


4 THE NATURALISTIC ANALYSIS OF LOGIC 


In [1976] Bloor's concern is mainly to show that logic poses no threat to 
institutions, that its potency is only derived from the institutions in which it is 
routinely accepted. He maintains that where logic seems to threaten the 
institution it can be met by another logic. In particular, with respect to the 
Azande institutions: 


Logic poses no threat to the institution of witchcraft, for one piece of logic can 
always be met by another. Not even this is necessary unless someone uses the
inference in order to pose a threat, and if they do, it is the user not the logic that is
the threat. ([1976], p. 126) 


It is in support of this claim that Bloor produces the example of the Zande 
response in terms of 'cool' witchcraft substance. And, it must be admitted, this 
looks like a suggestion that the Azande have instituted the use of a reductio ad 
absurdum argument to avoid the undesirable conclusion. But, as I have shown,
this reading is wrong. The Azande do not revise their beliefs. To be sure, it is 
possible to reconstruct some of their beliefs in the form of an argument whose 
conclusion contradicts one of their other beliefs. But the whole point of Evans- 
Pritchard's ethnography is that this is not an argument that they use—it is not 
a line of thought that is included in their shared practices. On this basis, Bloor 
argues that for the Azande there is no contradiction. My final concern will be to 
present this argument. 

In Wittgenstein: A Social Theory of Knowledge [1983] Bloor offers an account
of contradiction which is naturalistic—an account in terms of which the 
apparent logical force that is attached to contradictions is shown to derive from
their practical role. For Bloor, the point is that it is not the contradiction that 
results in the unacceptability, it is, rather, the unacceptability that results in 
the contradictoriness. Bloor appeals to Wittgenstein’s idea that use determines 
meaning. For Bloor and Wittgenstein the issue of contradictions is handled 
naturalistically, that is, the difficulties with contradictions are seen to arise not 
from the logic of contradictions but from the practices in which contradictions 
are involved. The general position is that meaning is derived from use, and the 
specific position with respect to contradictions is that they are unacceptable 
because of the problems in practice to which they give rise. Bloor, following 
Wittgenstein, tells us: 


We should not explain the lack of use of ‘p and not p’, when conjoined, by 
appeal to the meaning of negation. The meaning of negation does not determine 
the use; the use determines its meaning. ([1983], p. 122) 


And with respect to the practical difficulties of contradictions: 


The practical difficulties that surround the utterance of contradictions are not to 
be explained by logic. Rather, their logical properties are to be grounded in these 
practical problems. . . . ([1983], p. 122) 


The difficulty in being told that it is raining and it is not raining is not that it is a 
contradiction but that we cannot both take and not take the umbrella. 

Now as Evans-Pritchard has told us, for the Azande there is no difficulty in
practice. And so, on Bloor's account, there is no contradiction. Moreover, 
Zande practices are not such as to create difficulties. Evans-Pritchard tells us 
that: 


A man never asks the oracles, which alone are capable of disclosing the location 
of witchcraft-substance in the living, whether a certain man is a witch. He
asks whether at the moment this man is bewitching him. One attempts to
discover whether a man is bewitching some one in particular circumstances and
not whether he is born a witch. ([1937], p. 25)


Thus the Azande are in no danger of reaching the unacceptable conclusion 
that worries Triplett. For Evans-Pritchard this is why the Azande fail to see the
contradiction. For Bloor this is why there is no contradiction. For Evans- 
Pritchard there is a contradiction because he has Western practices that do not
fit with the Azande practices—he elaborates Zande beliefs along lines which 
are not natural for the Azande. Evans-Pritchard and Triplett suggest the 
Azande ought to draw from their beliefs a general conclusion which would 
result in practices incompatible with other of their practices—which, in 
particular, would result in their regarding everyone in a whole tribe as a witch. 
But this is not an acceptable attitude and the Azande do not in practice so
elaborate their beliefs. They see the possibility of this elaboration, and they also 
see ways of deflecting such elaboration. But these ways are not to be construed 
as an exercise in logic by the Azande, they are not an institutionalized or
shared practice but, rather, are merely the operation of their natural
rationality. When presented with a Western elaboration of their beliefs they 
respond in terms of their individual proclivities of thought. Both the Western 
elaboration and their responses fall outside the normal patterns of their
thought; these elaborations are not part of the collective thought practices,
the logic, of the Azande. For the Azande, with their commitment to the
premises and the denial of the conclusion, there is no reason to take on this
Western elaboration of their beliefs. The point is that they simply do not pursue
this line of elaboration, and thus, from a naturalistic point of view, it is
legitimate to say that there is no contradiction. 


RICHARD C. JENNINGS 
Department of History and Philosophy of Science 
University of Cambridge 


REFERENCES 


BARNES, S. B. [1976]: ‘Natural Rationality: A Neglected Concept in the Social Sciences’,
Philosophy of Social Science, 6, pp. 15-26. 

BLOOR, D. [1976]: Knowledge and Social Imagery. London: Routledge and Kegan Paul.

BLOOR, D. [1983]: Wittgenstein: A Social Theory of Knowledge. London: The Macmillan
Press Ltd. 

EVANS-PRITCHARD, E.E. [1937]: Witchcraft, Oracles and Magic Among the Azande. Oxford: 
Clarendon Press. 

TRIPLETT, T. [1988]: ‘Azande Logic Versus Western Logic?’, The British Journal for the
Philosophy of Science, 39, pp. 361-366. 


Brit. J. Phil. Sci. 40 (1989), 287-288 Printed in Great Britain


DISCUSSION 


Some Comments Concerning Spin 
and Relativity 


Several recent articles in this journal have discussed the relationship between 
spin and special relativity. Morrison [1986] suggests that relativity is 
necessary for a proper understanding of spin, while Von Peschke [1987] 
disagrees, citing the following remark of Hestenes and Gurtler [1971] with 
approval: ‘The Dirac theory takes account of relativity, but contrary to a 
widespread opinion, it says nothing fundamental about spin that is not already 
in the Pauli theory.’ However, both of these accounts are to some extent 
misleading. The correct understanding of the relationship of spin to relativity 
follows from the classic paper by Wigner [1939] on the Poincare group. 

In this paper Wigner studies the subgroups of the Poincare group that leave 
the four momentum of a free particle invariant. For a massive particle, with a 
timelike four momentum, the subgroup (or little group) is the rotation group— 
we can see this by looking in the particle's rest frame. From this it can be shown
that under an arbitrary proper Lorentz transformation, the state of a massive 
particle transforms as a representation of the rotation group. Therefore, in 
relativistic, as well as nonrelativistic quantum mechanics, the allowed values 
of the spin of a massive particle label the representations of the rotation group. 
In this sense, relativity does not offer any new understanding of spin. So here I 
side with Von Peschke. 

However, the little group for a massless particle is not the rotation group (for 
example, a massless particle has no rest frame), but one locally isomorphic to 
E(2), the group of rotations and translations in the Euclidean plane. Speaking
heuristically, only rotations about the direction of motion will leave the four
momentum of a massless particle invariant. The group of rotations in the plane
has only one-dimensional representations, and so a massless particle has a
single spin state or helicity. That is, unless the Poincare group is extended by
parity. Then there are two helicity states, one parallel, the other antiparallel to
the direction of motion.¹

This fact, that massless particles have at most two spin or helicity states, is
something that is genuinely relativistic and cannot be understood within the 
framework of nonrelativistic quantum mechanics. So here I side with 
Morrison's suggestion. 
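
The heuristic statement about which rotations leave a four momentum invariant can be checked numerically. The following is a minimal sketch in Python and not part of Wigner's argument; the function names, the unit choice c = 1, and the sample momenta are assumptions made for illustration. It checks rotations only, and so does not exhibit the remaining elements of the massless little group, which combine boosts with rotations.

```python
import math


def apply(matrix, p):
    """Apply a 4x4 Lorentz matrix to a four momentum (E, px, py, pz)."""
    return [sum(matrix[i][j] * p[j] for j in range(4)) for i in range(4)]


def rotation_about_z(theta):
    """Spatial rotation by theta about the z axis, acting on (E, px, py, pz)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def rotation_about_x(theta):
    """Spatial rotation by theta about the x axis, acting on (E, px, py, pz)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, -s], [0, 0, s, c]]


def leaves_invariant(matrix, p, tol=1e-12):
    """True if the transformation maps p to itself within the tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(apply(matrix, p), p))


# Hypothetical sample momenta in units with c = 1.
MASSIVE_AT_REST = [1.0, 0.0, 0.0, 0.0]   # timelike, viewed in its rest frame
MASSLESS_ALONG_Z = [1.0, 0.0, 0.0, 1.0]  # lightlike, moving along z

if __name__ == "__main__":
    theta = 0.7
    # Any rotation fixes the rest-frame momentum of a massive particle.
    print(leaves_invariant(rotation_about_x(theta), MASSIVE_AT_REST))   # True
    print(leaves_invariant(rotation_about_z(theta), MASSIVE_AT_REST))   # True
    # Only rotations about the direction of motion fix the massless one.
    print(leaves_invariant(rotation_about_z(theta), MASSLESS_ALONG_Z))  # True
    print(leaves_invariant(rotation_about_x(theta), MASSLESS_ALONG_Z))  # False
```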


1 For an in-depth development of the methods of Wigner [1939], see Kim and Noz [1986].


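In fact, consider the Dirac equation in the so-called chiral representation,

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \qquad \psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix};$$

the $\sigma^i$ are the Pauli matrices and 1 is the 2 × 2 identity. Then when the mass
m = 0 the Dirac equation,

$$(\gamma^\mu P_\mu + m)\psi = 0,$$

decouples into the two equations,

$$\frac{\sigma \cdot P}{P_0}\,\psi_L = -\psi_L, \qquad \frac{\sigma \cdot P}{P_0}\,\psi_R = \psi_R.$$

$\psi_L$ and $\psi_R$ are the two eigenstates of the helicity operator, $\sigma \cdot P / P_0$, and they are
related by the parity operator P, $P\psi_L = \psi_R$. If parity is not conserved we thus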
have a massless Dirac particle with a single spin state. Contrary to the quote 
from Hestenes and Gurtler [1971], the Dirac equation says something here 
that is fundamental about spin that is not already in the Pauli theory.
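
The decoupling can be made explicit in block form. What follows is a sketch under assumptions that the text does not fix: the standard chiral (Weyl) representation of the gamma matrices, metric signature (+, −, −, −), and the ordering of the two-component spinors as $\psi = (\psi_L, \psi_R)$.

```latex
% Assumed conventions: Weyl representation, signature (+,-,-,-),
% \psi = (\psi_L, \psi_R)^T, and \sigma\cdot P = \sigma^i P^i.
\[
\gamma^\mu P_\mu = \gamma^0 P_0 - \boldsymbol{\gamma}\cdot\mathbf{P}
= \begin{pmatrix} 0 & P_0 - \sigma\cdot P \\ P_0 + \sigma\cdot P & 0 \end{pmatrix},
\]
\[
(\gamma^\mu P_\mu + m)\psi = 0
\;\Longleftrightarrow\;
\begin{cases}
(P_0 - \sigma\cdot P)\,\psi_R + m\,\psi_L = 0,\\[2pt]
(P_0 + \sigma\cdot P)\,\psi_L + m\,\psi_R = 0.
\end{cases}
\]
% Setting m = 0 removes the only terms that couple \psi_L to \psi_R:
\[
\frac{\sigma\cdot P}{P_0}\,\psi_R = \psi_R,
\qquad
\frac{\sigma\cdot P}{P_0}\,\psi_L = -\psi_L .
\]
```

On these conventions the mass term is the only thing that links the two helicity components, so at m = 0 either component can be kept on its own as a particle with a single spin state.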


ROBERT WEINGARD 
Department of Philosophy 

The State University of New Jersey 
New Jersey, USA 


REFERENCES 


HESTENES, D. and GURTLER, R. [1971]: Am. J. Phys., 39, p. 1028.

KIM, Y. S. and NOZ, M. E. [1986]: Theory and Applications of the Poincare Group. D. Reidel.

MORRISON, M. [1986]: Brit. J. Phil. Sci., 37, p. 101.

VON PESCHKE, J. [1987]: Brit. J. Phil. Sci., 38, p. 566.

WIGNER, E. P. [1939]: Ann. Math., 40, p. 149.







RATIO (New Series) 


Edited by Edward Craig 


In June 1988 Basil Blackwell started publishing Ratio (New Series), a 
journal as continuous as possible with the journal formerly entitled 
Ratio. The journal publishes high quality work on a wide variety of topics. 
The attempt to further co-operation and understanding between 
philosophers who work primarily in German and philosophers who work 
primarily in English remains an important feature of editorial policy. 


Recent articles include: 


The Crypto-Metaphysic of ‘Ultimate Causes’ 
ANDREAS DORSCHEL 


Making up your Mind: Self-interpretation and Self-Constitution 
RICHARD MORAN 


The Unifying Role of Symmetry Principles in Particle Physics 
BRIGITTE FALKENBURG 




Ratio (New Series) is published in June and December 
Subscription Rates Volume 2, 1989 

Individuals: £20.25 (UK) £24.75 (overseas) US$47.50 (N. America) 
Institutions: £31.00 (UK) £38.75 (overseas) US$68.00 (N. America) 


[ ] Please enter my subscription to Ratio (New Series)/send me a 
sample copy 
[ ] I enclose cheque/money order made payable to Basil Blackwell 


[ ] Please charge my Access/American Express/Barclaycard/Diners 
Club/Mastercharge/Visa account number: 


Card expiry date 


[ ] For payments via the National Giro Bank, the Basil Blackwell account 
number is 236 6053 


Please return this form together with your payment if applicable to: 


Basil Blackwell - Journals Marketing Manager, 
108 Cowley Road, Oxford OX4 1JF, UK or Journals Marketing Manager, 
432 Park Avenue South, New York, NY 10016, USA. 





THE PHILOSOPHICAL 
QUARTERLY 


Editorial Chairman: Neil Cooper 

Executive Editor: Roger Squires 

Founded by the Scots Philosophical Club and the University of St Andrews in 
1950, The Philosophical Quarterly aims to foster and publish significant 
contributions in every branch of the subject, promoting discussion of recent 
philosophical works, and providing critical reviews of recent books, in a style 
accessible to as wide a group of philosophers as possible. 


Recent articles include: 


The Mind of God and the Works of Man by E. Craig 
R. HARRISON 


Kant and the Claims of Knowledge by P. Guyer 
A. AXIOTIS 


Appearance and Reality by P. Hacker 
D. STERN 


The Rationality of Emotion by R. De Sousa 
A. MELE 



The Philosophical Quarterly is published in April, July, October for the Scots 
Philosophical Club and the University of St Andrews 

Subscription Rates Volume 37, 1988 

Individuals: £13.00 (UK) £16.00 (overseas) US$28.50 (N. America) 

Institutions: £33.00 (UK) £44.00 (overseas) US$75.50 (N. America) 

[ ] Please enter my subscription to The Philosophical Quarterly/send me a 
sample copy 


[ ] I enclose cheque/money order made payable to Basil Blackwell 


[ ] Please charge my Access/American Express/Barclaycard/Diners Club/ 
Mastercharge/Visa account number: 


Please return this form together with your payment if applicable to: 


Basil Blackwell - Journals Marketing Manager, 
108 Cowley Road, Oxford OX4 1JF, UK or Journals Marketing Manager, 
432 Park Avenue South, New York, NY 10016, USA. 





metaphysics 


a philosophical quarterly 


ISSN 0034-6632 
JUNE 1989 | VOL. XLII, No.4 | ISSUE No. 168 | $9.00 


articles 

critical study 

books received 
philosophical abstracts 
announcements 


index 


ANDREW VINCENT Can Groups Be Persons? 
LUDO PEFEROEN The Heterogeneity of Thinking 
JAMES F. ROSS The Crash of Modal Metaphysics 
CHARLES E. SCOTT The Middle Voice of Metaphysics 


JOSEPH M. ZYCINSKI The Doctrine of Substance and 
Whitehead's Metaphysics 


BRUCE MORITO Fundamental Ontology and Personal Identity: 
A Critique of Albert Shalom's View of Personhood 


MARK M. HURLEY AND STAFF Summaries and Comments 


Individual Subscriptions $23.00 Institutional Subscription $35.00 Student/Retired Subscriptions $12.00 


Jude P. Dougherty, Editor 
The Catholic University of America, Washington, D.C. 20064 








NOTRE DAME 
JOURNAL OF FORMAL LOGIC 


Editors 
Michael Detlefsen Mark E. Nadel 
John P. Burgess Anand Pillay 
William W. Tait 




The Journal aims to provide a common ground where 
both philosophers and mathematicians can read and publish 
significant work on all areas of philosophical and mathema- 
tical logic; philosophy of language and formal semantics for 
natural languages; and the philosophy, history, and founda- 
tions of logic and mathematics. 

Upcoming contributions include: 


Michael Dummett .......... More on thoughts 
Albert Visser .............. Peano’s smart children: 
A provability logical study 
of systems with built-in 
consistency 
Johan van Benthem ......... Notes on modal definability 
Fred Sommers ............. Predication in the logic 
of terms 
John Hawthorn ............ Natural deduction in normal 
modal logic 
Editorial correspondence should be addressed to: 
The Editors 
P.O. Box 5 
Notre Dame, Indiana 46556 
U.S.A. 


Business correspondence should be addressed to the 
business manager, Jan Houseman. 


Subscription Rates: The Journal is published quarterly 
(January, April, July, and October) with annual subscriptions 
being $40 for institutions and $20 for individuals. 

All back issues are currently available. 


HISTORY AND PHILOSOPHY 
OF LOGIC 


EDITOR 
I. Grattan-Guinness 


Middlesex Polytechnic at Enfield 
Middlesex EN3 4SF, UK 


HISTORY AND PHILOSOPHY OF LOGIC is 
devoted to the study of the historical development 
and broader philosophical concerns of logic. The 
journal is primarily concerned with general 
philosophical questions on logic: existential and 
ontological aspects, the relationship between classical and non-classical logics, and the 
connections between logic and other fields of knowledge such as mathematics, physics, 
philosophy of science, epistemology, linguistics, psychology and latterly computing. The 
journal contains a large book review section, and carries special features on manuscript 
collections, and on projects in progress. 





A Selection of Recent Contents 

Hobbes's Logic: Language and Scientific Method, W. R. de Jong * What are Logical 
Notions? A. Tarski, edited by J. Corcoran * Peano as Logician, W. V. O. Quine * Frege's 
Theory of Real Numbers, P. M. Simons * On the Philosophical Foundations of Free 
Description Theory, K. Lambert * Beltrami's Model and the Independence of the Parallel 
Postulate, M. J. Scanlan * The Development of Probability Logic from Leibniz to MacColl, 
T. Hailperin. 


Publisher: Taylor & Francis 
Subscription Information 
Volume 10 (1989) Published twice a year ISSN 0144-5340 
Institutional Rate: US$95 £52 
SEND FOR A FREE SPECIMEN COPY 


TAYLOR & FRANCIS 


UK: Rankine Road, Basingstoke, Hants RG24 OPR 
USA: 242 Cherry Street, Philadelphia PA 19106-1906 














OXFORD UNIVERSITY PRESS 


NEW IN PAPERBACK 


Incompleteness, Nonlocality, and Realism 
A Prolegomenon to the Philosophy of Quantum Mechanics 


MICHAEL REDHEAD 
WINNER OF THE 1988 LAKATOS AWARD FOR AN OUTSTANDING 


CONTRIBUTION TO THE PHILOSOPHY OF SCIENCE 


‘Michael Redhead has groan Me'a le Ls 
Canadian Philosophical Review 


see of the new philosophy of physics . . . long-awaited, authoritative and splendid book.’ Philosophical 


Clarendon Paperbacks 
Anal ae 7, 208 pages, paper covers, Clarendon Press £9.95 
0 19 824937 3, 208 pages, Clarendon Press £25.00 


Philosophy, Psychiatry and Neuroscience— Three 


Approaches to the Mind 
A Synthetic Analysis of the Varieties of Human Experience 


EDWARD M. HUNDERT 


This book proposes a new, unified view of the mind which integrates the insights of philosophers, 
psychologists, and neuroscientists. 


0 19 824796 6, 368 pages, Clarendon Press £30.00 
May 1989 


Laws and Symmetry 


BAS C. VAN FRAASSEN 


The author analyses and rejects the arguments that there are laws of nature, or that we must believe 
that there are, arguing that we should discard the idea of law as an inadequate clue to science. 


0 19 824860 1, 340 pages, paper covers, Clarendon Press £10.95 
019 oe 3 340 pages, Clarendon Press £32.50 
June 1989 


Nature's Capacities and their Measurement 


NANCY CARTWRIGHT 


Since Hume, empiricists have barred powers and capacities from nature. But this book argues that 
capacities are essential in our scientific world, and that, contrary to empiricist orthodoxy, they can comply 
with strict standards of testability. 


0 19 En 0, 360 pages, Clarendon Press £27.50 
June 1989 


For more information please contact Anne Kitson, Oxford University Press, Walton Street, Oxford OX2 6DP 


Oxford University Press, Publishers of the Second Edition of the 
Oxford English Dictionary 





GRAZER 
PHILOSOPHISCHE STUDIEN 


BAND 32 1988 VOLUME 32 
Aufsätze ON THE BACKGROUND OF Articles 
CONTEMPORARY PHILOSOPHICAL LOGIC 
Edited by Risto Hilpinen 


The History of Early Computer Switch A. W. BURKS & A. R. BURKS 
Peirce on the Indeterminate and on the Jay ZEMAN 
Hypostatic Abstraction in Empirical Science Thomas SHORT 
Meinong on the Foundations of Deontic Logic Seppo SAJAMA 


Die sogenannte Analytizität der Mathematik Jaques Pierre DUPUCS 
Das onto-logische Sechseck Wolfgang DEGEN 
Indexikalische Ausdrücke und Propositionen Wolfgang BECKER 


Diskussionen Discussions 
Russell’s Robust Sense of Reality: A Reply to Butchvarov Jan DEJNOZKA 
Russell's Views on Reality Panayot BUTCHVAROV 
The Meinongian-Antimeinongian Dispute Reviewed Stewart UMPHREY 
Replies to Butchvarov and Umphrey Jan DEJNOZKA 
Das Vindizierungsargument funktioniert! Eine Erwiderung auf Piller Gerhard SCHURZ 
Eine Verteidigung des Angriffs auf das Vindizierungsargument Christian PILLER 
On the Propositional Relation Theory of Perception Jong Ho HA 
Contents and Objects of Experience. A Reply to Jong Ho Ha Ernest SOSA 
Das unbestimmte Argument von der Skepsis 


Herausgeber Editor 
Prof. Dr. Rudolf Haller, Institut für Philosophie, Universität Graz, Heinrichstraße 26, A-8010 Graz, 
Österreich/Austria. 


REVUE INTERNATIONALE DE PHILOSOPHIE 


Editor: Michel MEYER 
143, av. A. Buyl, 1050 Brussels, Belgium 


Each number is devoted to a particular movement, a particular philosopher, or 
a particular problem. 


We publish 4 issues annually. Articles are written in English, French, German, 
or Italian. 


Our last issue was devoted to Philosophie des Mathématiques 


Articles by : M. D. Resnik, Mathematics from the Structural Point of View — C. 
Wright, Why Numbers can believably be: a Reply to Hartry Field — G. G. 
Granger, Sur l'idée de concept mathématique «naturel» — V. Ya. Perminov, On 
the Reliability of Mathematical Proofs — A. Barabashev, Empiricism as a 
Historical Phenomenon of Philosophy of Mathematics — Analysis and review. 


Coming numbers : Heidegger — Wittgenstein — Freud 





Brit. J. Phil. Sci 40 (1989), 289-306 Printed in Great Britain 


Vectors and Change 


JOHN BIGELOW AND ROBERT PARGETTER 


ABSTRACT 


Vectors, we will argue, are not just mathematical abstractions. They are also 
physical properties—universals. What make them distinctive are the rich and 
varied essences of these universals, and the complex pattern of internal relations 
which hold amongst them. 


I 


Vectors emerge from the problem of change. A very important special case is 
the problem of motion, motion being conceived as change of place. And 
change has been a problem since the time of ancient Greek science. The trouble 
with change, as far as the Greeks were concerned, arose because the very 
notion of change seems to involve a contradiction. Suppose a discus changes 
from round to square. This entails that the discus is round; and it entails that 
the discus is square—and isn't that contradictory? No, not if we recognize that 
instantiation is tensed. A thing can be round at one time and square at 
another. But is this all there is to the process of change, that a thing has one 
property at one time, and another property at another time? 

In the late Middle Ages a debate developed over this issue.¹ Ockham and his 
numerous followers argued that change is nothing more than the possession of 
a sequence of different properties at different times. This was called the doctrine 
of ‘changing form’ (forma fluens). The opposition doctrine was (confusingly) 
called the doctrine of ‘change of form’ (fluxa formae). It is not easy to glean the 
difference between the doctrines just from their names: what is the difference 
between changing form and change of form? The key lies in the shift between 
‘form’ being the noun in the name of Ockham's doctrine, qualified as 
‘changing’, whereas the name of the opposition doctrine uses ‘change’ itself as 
the noun. The latter doctrine treats change itself as a subject, a characteristic of 
a thing. Thus, for instance, it is held by the ‘change of form’ doctrine that an 
object which is darkening will not only have a given shade of, say, grey at a 
given time, it will also have a changing shade of grey. The property of being a 


¹ For an intriguing exposition and further references, see E. Dijksterhuis [1959]. The Ockhamist 
doctrine has become the orthodoxy among analytical philosophers this century. 







darkening grey at a time is claimed to be a property of a thing at that time, and not 
just reducible to its various properties at other times. 

The medievals discussed many kinds of change: darkening, cooling, getting 
heavier, and so forth. One which they found especially interesting was the 
change involved in a thing (or a person) getting better. But for science, the 
crucial case was that of change of place—motion. The Ockhamist doctrine of 
‘changing form’ held that motion is no more than just the occupation of 
successive places at successive times. The opposing doctrine of ‘change of form’ 
insisted that a moving body possesses more than just a position, at a given 
time, but also what amounts to an instantaneous velocity. 

Newton's name for the differential calculus was the theory of 'fluxions'— 
and like the medieval doctrine of 'fluxa formae', Newton's theory attributed 
instantaneous velocities to moving bodies. Let us call the anti-Ockhamist view 
the doctrine of flux. 

The doctrine of flux, in essence, attributes a vector to a moving object at a 
time. The Ockhamist view, too, can be seen as employing vectors, but in a 
different sense, a less full-blooded sense. Ockhamists can use the sequence of 
positions of a body to define a vector of an abstract mathematical kind. 
Followers of Newton have done likewise in their interpretation of 'instanta- 
neous velocity'. Using the sequence of positions of a body over time, we can 
define the vector giving its so-called instantaneous velocity. But thus defined, 
the vector is a mathematical abstraction, reflecting a pattern displayed across 
time—but it is not an intrinsic property of a body at a particular time. 

In contrast, the doctrine of flux does attribute a vector to a moving body at a 
time. This vector is not construed as merely a description of where the object 
has been or will be, but as an intrinsic property of an object at a particular 
instant. In addition to the Ockhamist second order property of having various 
positions at various times, there is the first order (perhaps varying) property of 
the object at each time. Consider an object which is in motion, and consider its 
properties at a given time. At a given time it has a position. It also has relational 
properties: it has a history, and a destiny. It has the property of having been 
located in such-and-such places in the recent past, and about to be located in 
other places in the near future. So far, the Ockhamist agrees. The question is 
then whether these properties are all that the instantaneous motion vector 
amounts to. The doctrine of flux denies that it is. There is a further property 
over and above an object’s position, history and destiny. Indeed, the vector is to 
a considerable extent independent of the position, history and destiny. The 
object could have had the same past positions and not have had the vector; and 
conversely, the object might not have had those past positions and yet still 
have had the vector. Similar comments apply for future positions of the object. 

In typical medieval fashion, the flux theorists put their doctrine to 
theological use. Aristotle had argued that beyond the sphere of the stars there 
was nothing at all, not even empty space (indeed, especially not empty space, 




that being for Aristotle logically impossible). When combined with the 
Ockhamist theory of motion, Aristotle’s cosmology entailed that the universe 
as a whole could not have any motion (apart of course from its rotational 
motion). God, then, could not have set the universe into rectilinear motion. 
This sounded to some (including the then Pope) as though it unduly restricted 
God’s omnipotence. The doctrine of flux was used to provide a way out. God 
could have given the universe an instantaneous velocity, a vector—even 
though there was nowhere for the universe to go. This is entertainingly silly, 
but it is also instructive, because it highlights the independence between the 
‘flux’ vector and the mere list of ‘form-time pairs’ (as we might characterize the 
Ockhamist view). The supporters of the flux doctrine held that a body could 
possess a motion vector even if it possessed no appropriate sequence of different 
positions at different times. The motion vector was conceived as logically 
independent of both prior and posterior positions of the body. Furthermore, the 
flux theorists took the motion vector to explain the sequence of positions a 
body occupies, when indeed it does occupy such a sequence. The vectors 
explain the sequence of positions, not vice versa. We will defend this core 
doctrine of the flux theory. The idea that an object could have an 
instantaneous velocity, without any variation of position across time, may 
seem fanciful. But whatever essential links there may be between possession of 
a velocity vector at a time, and positions at other times, we argue that such 
links arise because the velocity vector explains change of position, not because 
it is defined by change of position. 

Although logically independent of each other, the velocity vector and the 
prior and posterior positions of a body will be importantly connected by 
physical laws. The presence of a vector at a time will contribute to an 
explanation of its subsequent positions. The flux doctrine reverses the order of 
explanation implicit in the Ockhamist view. 

For Ockham, the object has a certain velocity just because it has been in 
different places and will be in still further different places. Indeed, its having a 
velocity is not so much explained by its sequence of positions—rather, its 
having a velocity is nothing over and above its having a suitable sequence of 
positions. The having of a position is a first-order property (or predication—if 
you are a nominalist). Velocity is a second-order property (or predication), 
since it refers to or quantifies over a number of distinct first-order properties or 
predications. For Ockham, a second-order pattern over first-order positions at 
times is what constitutes velocity. Hence velocity cannot explain an object's 
sequence of positions, on Ockham's theory: we cannot say an object is now 
located to the right of where it was a moment ago because it was in motion a 
moment ago. According to the flux theory, that gets the explanatory direction 
cart-before-horse. An object will be, say, a little higher a moment from now 
because it is now moving upwards. It will be a little higher because it is now 
moving upwards—and not just because it is here now and was lower a 




moment ago. To explain why it will move still higher, we need to appeal to 
more than just the fact that it has moved upwards to here. We need to appeal to 
the fact that it is still moving upwards. The first-order properties of position are 
explained by another first-order property of instantaneous velocity. That is to 
say, the flux doctrine introduces a first-order property of instantaneous 
velocity in addition to the second-order Ockhamist property, and the flux 
doctrine uses the first-order property to explain the second-order one. 

In order for first-order instantaneous velocity to ‘explain’ second-order 
patterns of positions, there must be the right sort of relationship between first- 
order and second-order properties. On the one hand, there must be some 
degree of logical independence between them, if one is to ‘explain’ the other— 
at least, if explanation is here to be construed as (roughly) scientific rather than 
purely mathematical explanation. Yet on the other hand, the logical 
independence cannot be so exaggerated that there is no relevant connection at 
all between explanans and explanandum. What the flux theory requires is the 
logical possibility of first-order instantaneous velocity without second-order 
pattern of positions, and vice versa; but in addition, it requires some relevant 
connection between first-order and second-order properties. Roughly speaking, 
the first-order property must entail some appropriate second-order 
property, but only when supplemented with extra information of some 
appropriate, non-question-begging sort. 

Let us consider the logical independence between instantaneous velocity on 
the one hand, and sequences of positions on the other. In the first place, we 
should note that the possession of an instantaneous velocity, say East to West, 
is compatible with a variety of different sequences of positions. The body may 
have been due East a moment before, and may be due West a moment later. 
Yet it may never have been due East and may never be due West in future. 
Suppose for instance that it is at the highest point of a parabolic trajectory. 
Then its instantaneous velocity is horizontal, yet all its prior positions and its 
future positions are lower than its present position. 

The Ockhamist takes this to show that velocity is to be defined not just in 
terms of a finite distance covered in a finite time, but rather, as a mathematical 
limit of a sequence of distance-time pairs. 
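
The Ockhamist definition can be illustrated with a small numerical sketch. The projectile, its launch speeds and the value of g below are assumptions chosen purely for the example; nothing in the argument depends on them. The sketch forms distance-time pairs over ever shorter intervals starting at the apex of a parabolic trajectory, and shows that the resulting average velocities tend towards a horizontal vector even though every later position is lower than the apex.

```python
# Illustrative sketch: velocity as the limit of a sequence of distance-time pairs.
# The trajectory and every number here are assumptions made for the example.

G = 9.8     # assumed gravitational acceleration, m/s^2
VX = 3.0    # assumed horizontal launch speed, m/s
VY0 = 4.9   # assumed vertical launch speed, m/s

T_APEX = VY0 / G  # time at which the projectile reaches its highest point


def position(t):
    """Return the (x, y) position of the projectile at time t."""
    return (VX * t, VY0 * t - 0.5 * G * t * t)


def average_velocity(t, dt):
    """Displacement between t and t + dt divided by dt (one distance-time pair)."""
    x0, y0 = position(t)
    x1, y1 = position(t + dt)
    return ((x1 - x0) / dt, (y1 - y0) / dt)


if __name__ == "__main__":
    # Shrinking the interval: the vertical component shrinks towards zero,
    # while the horizontal component stays at VX.
    for dt in (0.1, 0.01, 0.001):
        print(dt, average_velocity(T_APEX, dt))
```

On this construal the velocity at the apex is whatever the printed pairs converge to. The flux theorist's complaint, taken up in the next paragraph, is that such a limit summarizes a pattern across times rather than naming a property the body has at the instant itself.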

The flux theory, however, argues that instantaneous velocity is in principle 
independent of the existence of distance-time pairs which tend to a mathemati- 
cal limit. We should reflect on two cases. There may be instantaneous velocity 
at a time, without any sequence of prior positions generating an appropriate 
mathematical limit—since an object may be jolted into motion instanta- 
neously. For similar reasons, there may be instantaneous velocity at a time, 
without any sequence of future positions generating an appropriate mathema- 
tical limit—since an object in motion may be instantaneously brought to rest, 
say by colliding with another. This is at least a logical possibility (and that is all 
we require for present purposes). Consider a simplified Newtonian world with 




two perfectly rigid bodies colliding, one going from velocity V to rest at just the 
moment when the other goes from rest to velocity V. 

Now consider three Newtonian rigid spheres A, B and C. Suppose B and C are 
at rest and are in contact with one another. A moves with velocity V along the 
line joining the centres of B and C, and strikes B. Theory tells us that A will stop, 
B will not budge, and C will move off with velocity V. The velocity of A is 
transferred from A, through B, to C. There is a moment in time when B has 
velocity V—even though there is no appropriate series of past or future 
positions for B which will yield velocity V as a limit. 
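
The three-sphere case can be sketched with the standard formulae for a head-on elastic collision in one dimension. The text specifies only that the spheres are perfectly rigid; equal masses and perfect elasticity are added here as assumptions, and the function names are hypothetical. Under those assumptions each collision simply exchanges velocities, so B holds velocity V between two collisions that, for rigid bodies in contact, occur at one and the same instant.

```python
# Illustrative sketch of the three-sphere case.
# Assumptions (not stated in the text): equal masses, perfectly elastic impacts.


def elastic_collision(m1, v1, m2, v2):
    """Velocities after a head-on, perfectly elastic collision in one dimension."""
    total = m1 + m2
    v1_after = ((m1 - m2) * v1 + 2 * m2 * v2) / total
    v2_after = ((m2 - m1) * v2 + 2 * m1 * v1) / total
    return v1_after, v2_after


def three_spheres(v, mass=1.0):
    """A (velocity v) strikes B, which is at rest and touching C.

    Returns the velocities (A, B, C) after the first exchange and after the
    second. With rigid bodies in contact, no time elapses between the two.
    """
    va, vb, vc = v, 0.0, 0.0
    va, vb = elastic_collision(mass, va, mass, vb)  # A passes its velocity to B
    intermediate = (va, vb, vc)
    vb, vc = elastic_collision(mass, vb, mass, vc)  # B passes it on to C
    return intermediate, (va, vb, vc)


if __name__ == "__main__":
    middle, final = three_spheres(2.0)
    print("at the instant of impact:", middle)
    print("afterwards:", final)
```

The intermediate state is the one that matters for the argument: B carries the velocity, yet B's positions before and after the impact are all the same, so no sequence of its positions yields V as a limit.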

This demonstrates that it is possible to have a first-order property of 
instantaneous velocity without any appropriate Ockhamist second-order 
property. Instantaneous velocity does not entail any sequence of positions 
which generates a mathematical limit. 

Conversely also, it may be argued that the existence of an Ockhamist 
sequence of positions does not entail instantaneous velocity in the sense 
required by the doctrine of flux. There could be an Ockhamist sequence of 
positions which was explained not by the presence of instantaneous velocities, 
but rather by some other source. Consider for instance the movement of an 
image projected onto a screen from a movie projector. The image of Clint 
Eastwood's cigar, for instance, may occupy a sequence of positions which 
generates an Ockhamist pattern. Yet the image, at a given time, does not have 
any instantaneous velocity. According to Malebranche [1688], the whole 
world is a bit like a movie, and God is a bit like the projector. An object is 
created at one location at one time, and is created at another location a 
moment later. At neither time is it really moving—it has no genuine 
instantaneous velocity in the sense required by the doctrine of flux. Its position 
at the later moment is not explained as causally dependent on its position and 
velocity at a previous moment. Rather, its position at the later moment is 
explained solely as causally dependent on the free choice of God. 

In Malebranche's world, the motion of a body is not in fact a genuine causal 
process, in the sense referred to in Einstein’s special theory of relativity. 
Einstein tells us no object can be accelerated up to and beyond the speed of 
light. But an image can be. For instance, suppose a spot to be projected by a 
laser onto the surface of one of Jupiter's moons, say Callisto, and then 
suppose the laser gun to be jiggled here on Earth. If the angles and distances are 
right, then the spot on Callisto will whizz across the surface of that moon faster 
than the speed of light.² This is no threat to Einstein's theory, because the 
motion of the spot is not a causal process—and the spot is not an object. The 
spot at one time is not really ‘the same thing’ as the spot at another time. When 
we say of the spot that 'it' has moved, this is deceptive. There is really no single 
‘it’ which is in one place at one time and another place at another time. In the 


² Non-causal pseudo-motion of this sort is discussed by H. Reichenbach [1958]. 




case of the spot, there is only the Ockhamist kind of velocity—a sequence of 
positions across times. The spot does not occupy the position it does because of 
its earlier position and its earlier instantaneous velocity. Rather, it occupies the 
position it does because of the position of the laser gun on Earth. For the spot, 
Ockhamist velocity is all there is. Hence the existence of a second-order pattern 
of positions does not entail the existence of first-order instantaneous velocity. 
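
A rough calculation shows how easily 'the angles and distances' can be right. The sketch below assumes an Earth-Jupiter distance of about 6.3e11 metres (the true figure varies with the positions of the two planets) and a beam that meets the surface at right angles; both are simplifying assumptions, and the function names are hypothetical. The spot's Ockhamist speed is then just the angular rate of the sweep multiplied by the distance.

```python
# Illustrative sketch: speed of a projected spot swept across a distant surface.
# The distance and the perpendicular-incidence geometry are assumptions.
import math

C = 299_792_458.0      # speed of light in m/s (exact by definition)
DISTANCE_M = 6.3e11    # assumed Earth-Jupiter distance in metres


def spot_speed(angular_rate_rad_per_s, distance_m=DISTANCE_M):
    """Transverse speed of the spot for a beam swept at the given angular rate."""
    return angular_rate_rad_per_s * distance_m


def min_angular_rate_for_superluminal(distance_m=DISTANCE_M):
    """Smallest sweep rate (rad/s) at which the spot outpaces light."""
    return C / distance_m


if __name__ == "__main__":
    rate = min_angular_rate_for_superluminal()
    print("threshold sweep rate:", math.degrees(rate), "degrees per second")
    print("spot speed at 1 rad/s, in units of c:", spot_speed(1.0) / C)
```

On these figures a sweep of a few hundredths of a degree per second already suffices. Nothing travels from one position of the spot to the next, which is why the sequence of positions can outrun light without any body possessing a superluminal velocity.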

Newton's three spheres show that instantaneous velocity does not entail an 
Ockhamist sequence of positions. The laser spot on Callisto shows that an 
Ockhamist sequence of positions does not entail the presence of instantaneous 
velocity. 

This establishes one of the things the flux theory requires: the logical 
independence of first-order from second-order velocity. Yet the flux theorist 
must beware. In order for the first-order velocity to explain the second-order 
velocity, there must be some fairly intimate link between them. 

One plausible link is suggested by the spot on Callisto. The spot lacks 
instantaneous (first-order) velocity. It also lacks genuine numerical identity 
across time—it is not really a single thing which occupies different places at 
different times. Perhaps these issues, velocity and identity, are linked. It may 
be argued that numerical identity across time is dependent on the presence of 
causal links across time. And then it may be argued that an Ockhamist 
sequence of positions plus numerical identity of an object across time does entail 
instantaneous velocity. That is to say, it may be argued that when a sequence 
of positions is occupied by the same object, then the object must have been 
moving if it was to get from one position to the next. 

This means of linking first and second-order velocities, however, is a dubious 
one. It attempts to draw a ‘metaphysical’ link between first- and second-order 
velocities—a link which is prior to the contingent laws of nature. Male- 
branche’s world, however, stands as a counterexample, if you believe that 
world to be a logical possibility at all. In Malebranche's world, God creates the 
very same object over and over, but in different places at different times. For 
Malebranche, there are patterns of both positions and identities across time, 
yet there are no first-order velocities of the sort required by the doctrine of flux. 
Hence there is no entailment between Ockhamist velocity and flux velocity, 
even if we add the assumption of numerical identity across time. 

For this reason, we argue that the link between Ockhamist and flux 
velocities is provided not by a metaphysically essential link, but rather by the 
contingent laws of nature. Ockhamist sequences of positions are (often) 
explained by instantaneous velocities. The reason is simply that given the laws 
of nature, there is (in most cases) no way of getting an object from one place to 
another except by imparting to it an instantaneous velocity. 

Velocity is not the only vector which plays an explanatory role of the sort we 
have been discussing. Indeed, for purposes of theory, velocity is frequently 
given no separate mention, but is subsumed under momentum, which of 




course is also a vector. Force is yet another vector, and so is acceleration. All 
these vectors feature in laws of nature, and it is this which gives them 
explanatory power. Each of these vectors constitutes an intrinsic characteristic 
of a thing at a time. Acceleration, for instance, is a characteristic of a thing at a 
time, and its possession helps to explain the pattern of past and future velocities 
of the thing. The connection between instantaneous acceleration, and a 
pattern of velocities across time, is mediated by their roles in laws of nature. 

There are other quantities which can be defined in physical theory, but 
which do not play any such explanatory role. Thus, for instance, consider the 
way an object accelerates as it falls toward the Earth. The Galilean law of free 
fall assumed constant acceleration. However, this is not strictly correct. As the 
body moves closer, the gravitational force becomes stronger, and a stronger 
force entails a greater acceleration. Hence the body's acceleration varies with 
time—and its acceleration varies at an accelerating rate. (Indeed, the relevant 
law is an exponential one, and it turns out that the rate of change of 
acceleration equals the rate of change of rate of change of acceleration . . . and 
so ad infinitum.) 

The rate of change of acceleration, however, does not play an explanatory 
role comparable to that of velocity, force, and so forth. There is thus no reason 
to posit an instantaneous vector for rate of change of acceleration, which 
explains past and future accelerations. The Ockhamist pattern of accelerations 
over time is all there is to the 'rate of change of acceleration' vector. The 
pattern of differing accelerations over time is not explained by a ‘flux’ for rate of 
change of acceleration. The pattern is adequately explained by 'flux' vectors of 
a lower order than that one. In the hierarchy of rates of change, only those of 
velocity and acceleration are instantaneous vectors, which explain patterns 
across time. The hierarchy is merely Ockhamistic from there on up. There is no 
good reason for positing flux vectors as well as Ockhamist ones, above the level 
of acceleration—at least not unless higher level rates of change should come to 
figure in newly discovered laws of nature. 

This further illustrates the explanatory link between flux vectors and 
patterns across time. The link is not a tight logical or metaphysical one, but a
looser, nomological one. 


II 


The doctrine of flux, it may be noted, rests comfortably with the Cartesian 
law of inertia: that an object continues with uniform motion in a straight line 
unless acted upon by a force. If we regard instantaneous velocity as a genuine, 
intrinsic property of an object at a time, then there is an important sense in 
which continuing motion is not a change. One of the object's intrinsic 
properties, its velocity vector, is remaining the same. Assuming that every 
change requires a cause, we can infer therefore that motion will only change if
there is a cause, a force, which acts on the body. Thus, the flux doctrine implies
a reversal of the Aristotelian view on motion. Aristotle held that motion was 
change, change of place, and so required continual force to maintain continual 
change of place. The flux doctrine takes change of place to be extrinsic change 
for a body, consequent upon lack of change of an intrinsic property, the velocity 
vector.³

The flux doctrine has a number of explanatory advantages, at least in the 
case of motion. (For some cases of change, the Ockhamist view is very probably 
quite right. The process of darkening, for instance, is quite different from that of
motion. Ockham may be quite right about darkening, cooling, moral 
improvement, and many of the other medieval talking-points.) Consider for 
instance a meteor striking Mars, and consider the problem of explaining why it 
creates a crater of precisely the size that it does. At the precise moment of 
impact, the meteor exerts a specific force on the surface of Mars. Why does it
exert precisely that force? Because it is moving at a particular velocity. On the 
Ockhamist view, what this amounts to is that it exerts the force it does because 
it has occupied such-and-such positions at such-and-such times. In other 
words, the Ockhamist appeals to the positions the meteor has occupied in the 
past. But why should a body’s past positions exert any force now? This requires 
the meteor to have a kind of ‘memory’: what it does to Mars depends not only
on its current properties, but also on where it has been. Indeed, given
that the meteor is accelerating, the Ockhamist will need a very fancy story
about how the action on Mars depends on previous positions. The meteor's 
action on Mars depends on the fact that the distance between successive 
positions has been proportional to the square of the time between them. 

For the Ockhamist, then, the impact on Mars will depend on the pattern of 
the sequence of positions prior to impact. The effect of a body at a time thus 
depends for the Ockhamist not only on the intrinsic character of the object at 
the time, but also on its history. This cannot be ruled out as absurd, without 
argument. Nevertheless, it is an advantage of the flux doctrine that it does not
require any pseudo-memory in objects, or any time-lag in causation. The 
meteor exerts a given force, at a moment, because of the property it has at that 
very moment. This property is an instantaneous velocity, a vector, with both 
magnitude and direction. The recent history of the body may be epistemically 
important, furnishing the canonical method of determining what its instanta- 
neous velocity must be. But it is the resulting velocity itself which is causally 
active, not the history which produced it, and revealed it to us. 

Consider another problem case for the Ockhamist: the spinning disc.⁴


3 It must be stressed here that we are exploiting a realist theory of universals, in which the sharing
of a property requires more than the applicability of the same description to two different things.
Sharing a property requires the existence of something which is present in the two things. Not
every predicate corresponds to a property in that sense. We are presupposing a theory of roughly
the same sort as D. M. Armstrong's [1978] ‘a posteriori realism’.

4 The problem of the spinning disc has been impressed upon us by work of D. Robinson, some of
which emerged in his [1982].


Suppose the disc to be of perfectly uniform shape, and to be made of a perfectly 
uniform substance. Each part of it is qualitatively indistinguishable from each 
other part. Consequently, as it spins, there is no change at all in the distribution 
of qualities over time. As it spins, each part is replaced by another part which is 
qualitatively identical. Yet the spinning disc is quite different from the 
stationary one. If you were to cut a slice out of the spinning disc, the gap would 
spin around; whereas if the disc were not spinning, the gap would not move. 
The two discs differ in their causal powers. What is the difference between the 
discs, which grounds the difference in causal powers? 

One difference between the discs is that even though they exactly match in 
the distribution of qualities over time, they differ in the identities of the portions 
of matter which instantiate those qualities. In the spinning disc, even though
the same quality is instantiated in a given place at two times, it may be
instantiated by a different portion of matter. Tracking identity of individual
portions of matter on the spinning disc, we will trace out circles. Tracking
identities on the stationary disc, we will not trace out circles: the portion we
start with will stay in the same place across time. 

This account of the nature of the spinning disc entails a disputed doctrine 
about identity. It requires us to trace the identity of a portion of matter in a way 
that cannot be reduced to a tracing of some bundle of qualities. In short, it 
requires non-qualitative identities. Why should we say a portion of the disc at 
one moment is identical with one portion rather than some other qualitatively 
identical portion? What distinguishes them is simply that one does, and the 
others do not, have a specific identity, or 'thisness', or (using the medieval 
word) 'haecceity' (for more, see Adams [1979]).

We are not opposed to non-qualitative identities. We accept that the 
spinning disc does display a circular displacement of identities across time. Yet 
this is not the whole story. Even if you can accept non-qualitative identities, it 
is hard to believe that they generate the kinds of causal powers which the
spinning disc possesses.

It is here that the doctrine of flux can play an explanatory role. The basis for
the causal powers of the disc is the possession, by each portion, of an
instantaneous velocity. It is because of the instantaneous velocities
instantiated at each moment that the identities of portions of the disc track in
circular paths.

For these reasons, we argue that instantaneous velocity, a vector with both 
magnitude and direction, should be construed as a physical property, a 
universal, possessed by a body at a given moment of time. It is an intrinsic 
property, not a property relating the object to other times and places. 

Yet in construing a vector thus, as an intrinsic property at a time, rather 
than a pattern displayed across time, we saddle ourselves with a problem. We 
need to explain what sort of property this is, and how it can be understood as 
having both a magnitude and a direction. It is not as though a property is like a
pin in a pincushion, with the length of the pin constituting the magnitude, and
its orientation constituting its direction.

By supporting the theory of flux, in fact, we make it more difficult to account
for the magnitude and direction of a vector. The Ockhamist theory does have 
an advantage on this score, at least on the face of it. If a velocity vector simply 
reflects a pattern of positions for a body across time, as they suggest, then 
magnitude and direction emerge immediately from the relevant temporally 
extended pattern. The past and future positions of the body lie in determinate 
directions from the present position, and at determinate distances from it. We 
can define instantaneous magnitude and direction for velocity, by taking limits 
of sequences of magnitudes and directions for prior and posterior positions. 

For physical reasons, we have argued, the flux theory should be accepted. 
On this theory, magnitude and direction for velocity cannot be defined in the 
Ockhamist manner. On the flux theory, we may attribute to a body some 
Ockhamist pattern of prior and posterior positions; and we may attribute an 
instantaneous velocity. But the latter attribution is not the same as the former:
in attributing an instantaneous velocity we are not attributing any pattern of
prior and posterior positions. So we cannot explain what constitutes the 
magnitude and direction of the velocity by appeal to that pattern. 

How then can we explain what constitutes the magnitude and direction of a 
vector? The trick is to draw upon a theory about the relations among 
properties. 


III 


First, consider two-place relations between velocities. Consider for instance 
two points on the flux theorist's spinning disc. Suppose these points both lie on 
the same radius, so that their instantaneous velocities have the same direction. 
Then these two velocities have something in common. Yet they differ from one 
another too, since the point further from the centre is moving faster than the 
one closer to the centre. The points along a given radius, at a given moment, all 
have something in common, yet they also differ. What they have in common, 
however, is not their velocities: each has a different velocity. Rather, what
they have in common is a second-order property: the property of having a 
velocity with such-and-such a direction. 

Consider in contrast the set of points on the disc which are located half-way 
between the rim and the axis of rotation. These points all have the same speed. 
That is to say, they all have a velocity of the same magnitude. But they are all 
moving in different directions. Hence what they have in common is not the 
possession of the same velocity. Rather, what they share is the second-order 
property of having a velocity of such-and-such a magnitude. 
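
The contrast between the two sets of points can be checked numerically. The following is only a sketch: it assumes a rigid disc spinning anticlockwise about its centre and uses the standard kinematic rule that a point's velocity is perpendicular to its radius with magnitude proportional to its distance from the axis. The function names, the angular speed `omega`, and the sample points are illustrative assumptions.

```python
import math

def velocity(radius, angle_deg, omega=1.0):
    """Instantaneous velocity (vx, vy) of a point on a rigid disc spinning
    anticlockwise at omega rad/s, located at polar position (radius, angle_deg)."""
    a = math.radians(angle_deg)
    # Perpendicular to the radius, with magnitude omega * radius.
    return (-omega * radius * math.sin(a), omega * radius * math.cos(a))

def magnitude(v):
    return math.hypot(v[0], v[1])

def direction_deg(v):
    return math.degrees(math.atan2(v[1], v[0])) % 360.0

# Two points on the same radius (at 30 degrees), at different distances.
inner = velocity(0.5, 30.0)
outer = velocity(1.0, 30.0)

# Two points at the same distance from the axis, on different radii.
east_side = velocity(0.5, 0.0)
north_side = velocity(0.5, 90.0)

# Same radius: shared direction, different magnitudes.
same_direction = math.isclose(direction_deg(inner), direction_deg(outer))
# Same distance from the axis: shared magnitude, different directions.
same_magnitude = math.isclose(magnitude(east_side), magnitude(north_side))
```

In this sketch the shared direction and the shared magnitude are what the two pairs of points respectively have in common, while neither pair shares a velocity.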

Thus vectors, construed on the flux theory as intrinsic properties, have 
second-degree properties, that is properties of properties. When two vectors
share one of these properties, we say they have the same direction; when they
share another of these properties, we say they have the same magnitude. If 
they share both magnitude and direction, then ‘they’ are really ‘it’, they are 
identical. This gives us an initial grip on the way that a flux theorist can 
understand the ‘magnitude’ and ‘direction’ of a vector. 

A note on terminology is in order here. A property of a property is said to be a
second degree property. Velocity, we say, is a property of an individual, whereas 
its magnitude and direction are second-degree properties of velocity. The 
existence of second-degree properties like magnitude and direction generates 
new properties of individuals, as well as properties of properties. Suppose an 
individual has a velocity, and that velocity in turn has a given direction. Then 
not only does the velocity have a property, but the individual too has a 
property as a consequence—the property of having a velocity with that direction. 
This is not a property of a property, so it is not a second-degree property. It is 
called a second-order property. A second-order property is a property of
individuals, but one which presupposes or entails the instantiation of a
second-degree property.

We are arguing that in the case of vectors, both second-degree and second- 
order properties are genuine universals, in the sense of Armstrong's a posteriori 
realism over universals. Thus, for instance, we argue that two points on the 
same radius of a spinning disc do have something in common. What they have 
in common is not merely that some linguistic item, a description, applies to 
both. There is something non-linguistic present in both, a genuine universal— 
a second-order universal. 

Two velocity vectors then may be the same in magnitude, or may differ in 
magnitude, and the same holds for direction. But of course that is only part of 
the story. Not only can we say whether two velocities are the same or different 
in magnitude (or direction), but also, we can say what manner of difference 
there is. When two velocities differ in magnitude, we need to say whether the 
first is greater than the second, or vice versa, and indeed we need to say how 
much they differ. Similarly, when two velocities differ in direction, we need to 
say how much they differ, whether for instance one is directed only a few 
degrees clockwise (in a given plane) from the other, or whether it is directed say
90° clockwise from the other.

To obtain a framework for understanding such 'degrees of difference', we
can draw upon ideas which derive from Frege, Whitehead, Wiener and Quine.⁵
The central idea we borrow from them is that comparisons of ‘degrees’ of
difference derive from specific relations between relations.


5 See G. Frege [1903]; A. N. Whitehead and B. Russell [1910-13], especially vol. 3, pt. VI;
N. Wiener [1912-14]; W. V. Quine [1941], also Set Theory and Its Logic [1969]; and J.
Bigelow [1988], Reality of Numbers. The Frege-to-Quine theory of real numbers has been applied
to scalar, as opposed to vector magnitudes, in R. Pargetter and J. Bigelow [1988].


Consider a relation, Q say. Suppose this relation holds between one
individual $x_0$ and another, $x_1$ say:


$x_0 \, Q \, x_1$.

Suppose it also holds between $x_1$ and some further individual, $x_2$ say:

$x_1 \, Q \, x_2$.


Furthermore, suppose there is a sequence of individuals, each one standing in
relation Q to the next:


$x_0 \, Q \, x_1 \, Q \, x_2 \, Q \, \dots \, Q \, x_n$.


When this chain of Q-relations holds, we may say that you can get from $x_0$ to $x_n$
by n Q-steps. Whenever there exists some sequence of individuals enabling you
to get from $x_0$ to $x_n$ by n Q-steps, we may then abbreviate this assertion by the
formula:


$x_0 \, Q^n \, x_n$.


It is convenient to extend the notation, in order to allow for negative numbers
as 'indices' for the relation, thus allowing for such things as:


$x_0 \, Q^{-2} \, x_2$.


This is to be understood as asserting, not that you can get from $x_0$ to $x_2$, but that
you can get to $x_0$ from $x_2$ by 2 Q-steps:


For some $x_1$: $x_2 \, Q \, x_1$ and $x_1 \, Q \, x_0$.


In general, ‘$x \, Q^{-n} \, y$’ will be equivalent to ‘$y \, Q^n \, x$’.
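
The indexed notation can be illustrated with finite relations represented as sets of ordered pairs. This is a minimal sketch under that assumption: the helper names `compose`, `inverse` and `power`, and the four-member chain, are hypothetical, and the index 0 is left undefined because the text does not discuss it.

```python
def compose(r, s):
    """Relative product: x (r;s) z iff x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def inverse(r):
    return {(y, x) for (x, y) in r}

def power(r, n):
    """The relation Q^n for a non-zero integer n.
    A negative index takes n steps along the inverse relation."""
    if n == 0:
        raise ValueError("index 0 is not defined in this sketch")
    base = r if n > 0 else inverse(r)
    result = base
    for _ in range(abs(n) - 1):
        result = compose(result, base)
    return result

# Hypothetical chain: x0 Q x1, x1 Q x2, x2 Q x3.
Q = {("x0", "x1"), ("x1", "x2"), ("x2", "x3")}

two_steps = power(Q, 2)        # pairs linked by two Q-steps
three_steps = power(Q, 3)      # pairs linked by three Q-steps
back_two_steps = power(Q, -2)  # pairs (x, y) such that y reaches x in two Q-steps
```

Here `back_two_steps` contains a pair (x, y) exactly when `two_steps` contains (y, x), which is the stated equivalence between the negative and positive indices.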

With this notation to call upon, let us compare two distinct relations, Q and 
R say. A variety of possible relationships may emerge between Q and R. It may 
be for instance that whenever two things are linked by a Q-chain: 


$x \, Q^n \, y$,
then there is never an R-chain which links these things:
For all m, not: $x \, R^m \, y$.


When this is so, it constitutes a relationship of a sort between Q and R, a degree
of independence between them. This relationship is, however, fairly unspecific:
it is a negative relationship, a lack of more specific, more intimate relations. It
is like the relation which holds between say a velocity and a volume, rather
than the relation between one velocity and another.

In contrast, consider a case in which, whenever two individuals are linked 
by a chain of Q-steps, then those individuals are also always linked by a chain 
of R-steps. When this is so, it constitutes an important relationship between Q 
and R. 


Consider a special case. Suppose that Q and R are such that, for any 
individuals x and y: 


$x \, Q^1 \, y$  if and only if  $x \, R^{-1} \, y$.


Then an important relation holds between Q and R—one is the inverse of the 
other. 

Consider another special case. Suppose that Q and R are such that, for any 
individuals x and y: 


$x \, Q^1 \, y$  if and only if, for some n,  $x \, R^n \, y$.


Then an important relation holds between Q and R—the former is the 
transitive closure, or ancestral, of the latter. 
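
Both special cases can be computed directly for a finite relation. The sketch below again assumes that relations are sets of ordered pairs; the three-generation family is hypothetical, and `ancestral` is simply the union of the positive powers of the relation.

```python
def compose(r, s):
    """Relative product: x (r;s) z iff x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def inverse(r):
    return {(y, x) for (x, y) in r}

def ancestral(r):
    """Union of R^1, R^2, R^3, ... for a finite relation R."""
    closure = set(r)
    while True:
        extended = closure | compose(closure, r)
        if extended == closure:
            return closure
        closure = extended

# Hypothetical family: (a, b) reads "a is a parent of b".
parent = {("ann", "bob"), ("bob", "cy"), ("cy", "di")}

child = inverse(parent)       # x child y iff y parent x
ancestor = ancestral(parent)  # x ancestor y iff x parent^n y for some n
```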

There are infinitely many other relationships that can hold between two 
relations Q and R. Here are some examples. Q and R may be such that, for any x 
and y: 


$x \, Q^1 \, y$  if and only if  $x \, R^2 \, y$
or:

$x \, Q^1 \, y$  if and only if  $x \, R^3 \, y$
or:

$x \, Q^2 \, y$  if and only if  $x \, R^3 \, y$


and so on. 

As Quine has noted, the first of these, $x \, Q^1 \, y$ if and only if $x \, R^2 \, y$, holds in the
case where Q is the grandparent relation, and R is the parent relation. If x and y
are linked by one grandparent relation, then they are linked in two steps by the
parent relation. In contrast, the condition, $x \, Q^1 \, y$ if and only if $x \, R^3 \, y$, is not
satisfied where Q is the grandparent and R the parent relation. It is, however,
satisfied where Q is the great-grandparent relation and R is the parent relation.
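
The grandparent example can be verified on a small case. This sketch assumes a hypothetical four-generation family in which the grandparent and great-grandparent relations are listed independently of the parent relation, so that the biconditionals are tested rather than built in; `satisfies` is an illustrative helper name.

```python
def compose(r, s):
    """Relative product: x (r;s) z iff x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def power(r, n):
    """The relation R^n for a positive integer n."""
    result = r
    for _ in range(n - 1):
        result = compose(result, r)
    return result

def satisfies(q, n, r, m):
    """True when, for all x and y, x Q^n y holds iff x R^m y holds."""
    return power(q, n) == power(r, m)

# Hypothetical family: (a, b) reads "a is a parent of b", and so on.
parent = {("ann", "bob"), ("bob", "cy"), ("cy", "di")}
grandparent = {("ann", "cy"), ("bob", "di")}
great_grandparent = {("ann", "di")}

grand_is_two_parent_steps = satisfies(grandparent, 1, parent, 2)
grand_is_three_parent_steps = satisfies(grandparent, 1, parent, 3)
great_grand_is_three_parent_steps = satisfies(great_grandparent, 1, parent, 3)
```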

For present purposes, there are more important relations which satisfy the 
various conditions illustrated above. Let Q be the relation which holds between 
two individuals just when one is travelling 2 m/s faster than another. Let R be
the relation which holds when one is travelling 1 m/s faster than the other. For
this Q and R, the condition is satisfied that for any x and y, $x \, Q^1 \, y$ if and only if
$x \, R^2 \, y$. In a similar way, we can use ‘relative velocities’ to instantiate any of the
conditions of the form:


$x \, Q^n \, y$  if and only if  $x \, R^m \, y$


These conditions require that there are enough relative velocities to serve as 
appropriate stepping-stones between one relative velocity and another. We 
trust that there are indeed enough relative velocities to serve. Whenever an 
object accelerates, it passes through continuum-many velocities, so all the
velocities we need are in fact instantiated, we claim. Even if they are not
instantiated in the actual world, that would not torpedo our theory. Parting 
company from one of the core doctrines of Armstrong's version of a posteriori 
realism about universals, we advocate a Platonist doctrine: that various 
physical properties and relations exist, whether or not any individuals 
instantiate them in the actual world. We thus assert that, whether or not 
anything actually has had all the required velocities, nevertheless there are 
enough relative velocities to make it the case that: 


$x \, Q^n \, y$ if and only if $x \, R^m \, y$,


for appropriate relations of relative velocity, Q and R. 
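
The role of the stepping-stones can be seen in a small finite model. The sketch assumes hypothetical bodies with integer speeds in m/s: in the first population every intermediate speed is instantiated, in the second the odd speeds are missing. The names `faster_by`, `dense` and `sparse` are illustrative only.

```python
def compose(r, s):
    """Relative product: x (r;s) z iff x r y and y s z for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def power(r, n):
    """The relation R^n for a positive integer n."""
    result = r
    for _ in range(n - 1):
        result = compose(result, r)
    return result

def faster_by(speeds, delta):
    """Pairs (x, y) where x is travelling exactly delta m/s faster than y."""
    return {(x, y) for x in speeds for y in speeds
            if speeds[x] - speeds[y] == delta}

# Every intermediate speed is instantiated by some body.
dense = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4}
# The odd speeds are missing, so there are no stepping-stones for R.
sparse = {"a": 0, "c": 2, "e": 4}

# Q is "2 m/s faster", R is "1 m/s faster"; test x Q^1 y iff x R^2 y.
dense_holds = faster_by(dense, 2) == power(faster_by(dense, 1), 2)
sparse_holds = faster_by(sparse, 2) == power(faster_by(sparse, 1), 2)
```

In the sparse population the condition fails only because no body instantiates the intermediate speed, not because the proportion between the two relations is any different.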

Non-Platonists may have more difficulty accepting the existence of all the 
stepping-stones required by conditions of this form. It is open to the faint- 
hearted to retreat here to a merely modal claim: that there could be sufficiently 
many things to yield the pattern required for the condition relating Q to R. 
However, we suggest this modal fact does require some categorical basis for its 
truth. There has to be some genuine relation of proportion between Q and R, in 
order for the modal condition to hold. Indeed there has to be such a relation in 
order for Q and R to instantiate the required pattern either possibly or actually. 
We maintain that there is a genuine relation of proportion between universals 
such as those of velocity, and this relation is what explains the patterns which 
emerge of the form 


$x \, Q^n \, y$  if and only if  $x \, R^m \, y$


Thus, we maintain, for any such condition there are relations of relative 
velocity—whether instantiated or not, actual or merely possible—which 
satisfy that condition. 

The converse, however, does not hold. It is not the case that for each pair of 
relative velocities, there is a condition of the above form which they 
instantiate. Consider the case where Q is the relation of being 1 m/s faster and
R is that of being π m/s faster. Then since the proportion between 1 and π is not
a rational number, it follows that there is no condition of the form


$x \, Q^n \, y$ if and only if $x \, R^m \, y$,


which is satisfied by this Q and R.

How then are we to explain the relation between being 1 m/s faster, and
being π m/s faster? By employing the technique of ‘Dedekind cuts’. We can
divide all the conditions of the form:


$x \, Q^n \, y$ if and only if $x \, R^m \, y$,


into two classes: a 'cut'.
The relation we are interested in (the one involving the irrational
proportion) will fail to satisfy any of the conditions, on either side of the cut.
Nevertheless, if the cut is constructed in the right way, there will be a definable
sense in which the π-relation falls ‘between’ the conditions on one side of the
cut, and those on the other side of the cut. The details need not detain us here.
The important point is that even irrational proportions such as that between 1
m/s faster, and π m/s faster, can be defined using conditions of the form


$x \, Q^n \, y$  if and only if  $x \, R^m \, y$.


Irrational proportions satisfy none of these conditions, but are characterized by 
being in a definable sense 'greater than' all those which fall on one side of a 
given cut and ‘less than’ all those on the other side of that same cut. 
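
One way such a cut might be set up can be sketched numerically; it is offered as an illustration under stated assumptions, not as the construction the text leaves unspecified. Each condition with indices n and m is taken to pick out the rational proportion n/m, since n steps of 1 m/s match m steps of r m/s exactly when r = n/m. The conditions are then sorted by whether that proportion falls below or above π, with π approximated by the floating-point constant `math.pi` and the indices capped at arbitrary bounds.

```python
import math
from fractions import Fraction

PI = math.pi  # floating-point approximation, an assumption of this sketch

# Rational proportions n/m picked out by conditions with small indices.
proportions = [Fraction(n, m) for m in range(1, 51) for n in range(1, 201)]

# The two classes of the cut: proportions below and above the irrational one.
below = [p for p in proportions if p < PI]
above = [p for p in proportions if p > PI]

# No condition is satisfied exactly, but the two classes close in on the
# irrational proportion from either side.
greatest_below = max(below)
least_above = min(above)
gap = least_above - greatest_below
```

Raising the bounds on n and m narrows `gap` further, while the irrational proportion itself never appears in either class.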

Techniques such as these can be used to describe infinitely many relations 
among relations. These can appropriately be called proportions among 
relations. The definitions of all of these proportions employ straightforward 
generalizations of the standard definitions of such relations as coextensiveness, 
inverse, transitive closure, and so forth. 


IV 


However, we are left with a problem when we try to bring this framework to 
bear on physical vectors like velocity. The definition of proportions applies to 
relations, not properties. Yet we have argued for the flux theory, according to 
which a vector, like velocity, is a property rather than a relation. To explain the 
nature of a vector, we need somehow to bridge the gap between relations which 
stand in definable proportions, and the properties which we have argued
vectors must be. 

The bridge we need is generated by an important correlation between 
properties and relations. Whenever two individuals have the same property, 
this constitutes the holding of a relation between the individuals—the 
individuals match in a certain respect. Schematically, when Fx and Fy, then
there is a relation, $R_F$ say, such that $x \, R_F \, y$.

More generally, even when two individuals have distinct properties, this (at
least sometimes) constitutes another sort of relation between them. When Fx
and Gy then there exists a relation, $R_{FG}$ say, such that $x \, R_{FG} \, y$. So we claim. For
present purposes we do not need to rely on the fully general claim, that any
properties whatever, F and G, generate a relation $R_{FG}$. We need only rely on the
claim that when F and G are vectors of the same sort, then they generate a
relation $R_{FG}$.

Let us return to the spinning disc. Restrict our attention to points whose 
velocity is in the same direction—points lying along the same radius of the 
disc. Each of these points has a different velocity. But because each has the 
velocity it does, there will be relations between them. The ones further from the 
centre will be moving faster than those closer in. More specifically, ones further 
out will be a determinate amount faster. One will be 1 m/s faster than another. 
Yet another will be π m/s faster. And so on.


In particular, we may compare each point on the radius with the point at the 
centre which is not moving at all. A point x with such-and-such velocity will be 
moving a specific amount faster than the central point y. Because the point x 
has the property it does, it will stand in a specific relation to y—it will be 
moving ‘so-much faster than’ y. 

Thus, the properties of points on the radius, their velocities, all generate 
relations of relative velocity. These relations stand in proportions to one 
another, defined according to the Frege-Whitehead-Wiener-Quine theory.
Then the properties, velocities, stand in a proportion to one another if and only
if the corresponding relations, relative velocities, stand in that proportion. Thus
for instance, the velocity we call 2 m/s is twice that which we call 1 m/s 
because the corresponding relation, 2 m/s faster than, is twice the relation, 1 
m/s faster than. Hence, we can bring the Frege-to-Quine theory to bear on 
velocities, in virtue of the natural correspondence between velocities, and 
relative velocities. 

Velocities have both magnitude and direction. We have focussed so far on 
differences in magnitude, keeping direction constant. We can make compari- 
sons of this sort by drawing our subject matter from the points along one given 
radius of a spinning disc. 

Now let us consider differences in direction, keeping magnitude constant. 
Consider a collection of points all of which are the same distance from the 
centre of the spinning disc. These points all have the same speed, but different 
directions. In virtue of their different directions, they will stand in determinate 
relations. In particular, one will be 90° clockwise from another, others will be
45° anticlockwise from yet others, and so on. A relation of 'so-many degrees
clockwise from' will hold between velocities; but in consequence of this, a 
similar relation will hold between the things which have those velocities. Thus 
for instance, points equally distant from the centre of the disc will be related to 
one another in distinctive ways: one will be moving so-many degrees 
clockwise from another. These relations stand in proportions to one another, 
by straightforward application of the Frege-to-Quine theory. 

Arbitrarily select one of the points on the circle we are considering: its 
velocity will have some determinate direction—say East. Then each other 
velocity around the circle will be oriented some specific number of degrees 
clockwise from the designated velocity. Hence each of the velocities around the 
circle can be paired with a specific relation. We can then say that two velocities 
stand in a determinate proportion, with respect to direction. Such proportion 
holds between velocities just in case that proportion holds between the 
correlated relations. We can thus apply the Frege-to-Quine theory of 
proportions, by associating each property with a relation, and then defining 
proportions between such relations.

Let us take stock, then, of the relations we have described among velocities. 
Take two points x and y along the same radius of the spinning disc. Their
velocities will stand in a specific proportion with regard to magnitude. In virtue
of this, we may say that x and y are related in a distinctive way to one another:
say that x has n times the magnitude of the velocity of y. Abbreviate this as:


$x \, P_1 \, y$.


Now compare y with a point z which is the same distance as y from the centre
of the disc. There will be a distinctive relation between y and z in virtue of y’s
possession of a velocity so-many degrees clockwise from that of z. Abbreviate
this as:


$y \, P_2 \, z$.

Put together the two relations:

$x \, P_1 \, y$

$y \, P_2 \, z$,


and we obtain a derivative relation between x and z. Call this derivative
relation $P_3$. We may define $P_3$ by saying:


$x \, P_3 \, z$ if and only if for some y, $x \, P_1 \, y$ and $y \, P_2 \, z$,


where $P_1$ and $P_2$ are the specific magnitude and direction proportions described
above.

On the spinning disc, any two points will be related by a 'two-step'
proportion of the same form as $P_3$. This is so in virtue of the intrinsic properties,
the instantaneous velocities, of each of the points on the disc. These properties
of instantaneous velocity count as vectors because they stand to one another in
a family of ‘two-step’ proportions, of the form of $P_3$.
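
The derivative 'two-step' relation can be sketched as a magnitude step followed by a direction step. The sketch assumes three hypothetical points whose velocities are recorded as a speed and a direction in degrees clockwise from an arbitrary reference. It reads the magnitude step as holding direction fixed and the direction step as holding speed fixed, which is an interpretive assumption based on the points being chosen along one radius and at one distance from the centre.

```python
# Hypothetical points: name -> (speed in m/s, direction in degrees
# clockwise from an arbitrary reference direction).
velocities = {"x": (2, 90), "y": (1, 90), "z": (1, 0)}

def magnitude_step(a, b, n):
    """a has n times the speed of b, and the same direction."""
    (speed_a, dir_a), (speed_b, dir_b) = velocities[a], velocities[b]
    return dir_a == dir_b and speed_a == n * speed_b

def direction_step(a, b, degrees):
    """a's velocity is `degrees` clockwise from b's, at the same speed."""
    (speed_a, dir_a), (speed_b, dir_b) = velocities[a], velocities[b]
    return speed_a == speed_b and (dir_a - dir_b) % 360 == degrees % 360

def two_step(a, c, n, degrees):
    """Derivative relation: for some b, a is related to b by the magnitude
    step and b is related to c by the direction step."""
    return any(magnitude_step(a, b, n) and direction_step(b, c, degrees)
               for b in velocities)

x_to_z = two_step("x", "z", 2, 90)   # via the intermediate point y
z_to_x = two_step("z", "x", 2, 90)   # no suitable intermediate point
```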

The 'two-step' proportions displayed among velocities can be generalized, to 
'n-step' proportions. This permits a physical grounding, along the lines of the
flux theory, for vectors of a more general sort. Mathematically, a vector like 
velocity is represented by an ordered pair of real numbers. Generalizing: all 
ordered n-tuples of real numbers are said to represent vectors. These more 
general vectors may be physically grounded, by generalizing from 'two-step' 
relations of proportion, to ‘n-step’ relations of proportion. That is to say, some of 
these mathematical ‘vectors’ may correspond to vectors in the sense required 
by the doctrine of flux. Not all of the mathematical n-tuples will correspond to 
vectors of the flux sort. Some may correspond only to the Ockhamist sort of 
second-order properties of things. Mathematically, it is useful to amalgamate 
several different flux-vectors which characterize an individual, and represent 
them as a single n-tuple which represents the individual's state. Usually these 
n-tuples will not correspond to any single flux-property over and above the 
aggregate of separate flux-properties which generated them. This will be useful 
for various purposes. However, in some cases physical theory might postulate 
that there is a first-order flux property as well, which will then play a distinctive
explanatory role in the theory. 


Vectors and vector spaces, as abstract mathematical machinery, are 
exceedingly useful in physical science. The reason they are so useful is that 
physical objects have various intrinsic properties which are aptly called 
'vectors', in a different sense from that of abstract pure mathematics. Some
but not all of the mathematical 'vectors' correspond to such intrinsic physical 
properties. What makes it appropriate to call these properties 'vectors'? The 
answer is: they are aptly called 'vectors' because of the rich network of 
relations of proportion into which they are embedded. These properties are 
related to one another in these ways, and furthermore, it is essential to the
natures of these properties that they enter into just those relationships. If two 
of these properties had not stood in the proportion in which they do stand to 
one another, then they would not be the properties that they are. 

An explanatory science requires the attribution of such properties as 
instantaneous velocities, which are vectors. These should be recognized as 
genuine physical properties, not just as useful jargon or abstract mathematical 
structures. They are as independent of language and mind as are the moving 
bodies which instantiate them. Furthermore, these properties are related to 
one another in a very tight pattern. In virtue of this pattern of interrelations, 
they stand in proportions to one another. And that is why real numbers are so 
useful in science.


REFERENCES 


ADAMS, R. M. [1979]: ‘Primitive Thisness and Primitive Identity’. The Journal of
Philosophy, 76, pp. 5-26.

ARMSTRONG, D. M. [1978]: A Theory of Universals. 2 Volumes. Cambridge University
Press.

BIGELOW, J. C. [1988]: The Reality of Numbers. Clarendon.

BIGELOW, J. C. and PARGETTER, R. J. [1988]: ‘Quantities’, Philosophical Studies, 54, pp.
287-304.

DIJKSTERHUIS, E. [1959]: The Mechanization of the World Picture. Trans. C. Dikshoorn,
Clarendon, 1961; reprinted, Princeton University Press, 1986.

FREGE, G. [1903]: Grundgesetze der Arithmetik. Volume II. Hermann Pohle.

MALEBRANCHE, N. [1688]: Dialogues on Metaphysics and on Religion. Trans. M. Ginsberg,
1923. George Allen and Unwin.

QUINE, W. V. [1941]: ‘Whitehead and The Rise of Modern Logic’ in The Philosophy of
Alfred North Whitehead, ed. P. A. Schilpp. Open Court.

QUINE, W. V. [1969]: Set Theory and Its Logic. Revised Edition. Harvard University Press.

REICHENBACH, H. [1958]: The Philosophy of Space and Time. Dover.

ROBINSON, D. [1982]: ‘Re-identifying Matter’. The Philosophical Review, 91, pp. 317-341.

WHITEHEAD, A. N. and RUSSELL, B. A. W. [1910-13]: Principia Mathematica. Cambridge
University Press.

WIENER, N. [1912-14]: ‘A Simplification of the Logic of Relations’. Proceedings of the
Cambridge Philosophical Society, 17, pp. 387-390.


Brit. J. Phil. Sci. 40 (1989), 307-322 Printed in Great Britain


The Metamathematics—Popperian 
Epistemology Connection and its 
Relation to the Logic of 
Turing's Programme* 


JEAN-ROCH BEAUSOLEIL 


ABSTRACT 


Turing's programme, the idea that intelligence can be modelled computationally, is set
in the context of a parallel between certain elements from metamathematics and
Popper's schema for the evolution of knowledge. The parallel is developed at both the
formal level, where it hinges on the recursive structuring of Popper's schema, and at the
contentual level, where a few key issues common to both epistemology and
metamathematics are briefly discussed. In light of this connection Popper's principle of
transference, akin to Turing's belief in the relevance of the theory of computation for
modelling psychological functions, is widened into the extended principle of
transference. Thus Turing's programme gains a solid epistemological footing.


1 Introduction 
2 PS & RS: The Formal Connection 
3 Contentual Connections 
3.1 'Self'-reference 
3.2 Objectivity 
3.3 Limits 
4 The Extended Principle of Transference 
5 Concluding Remark 


1 INTRODUCTION


An important feature of Turing’s paper on ‘Computing Machinery and 
Intelligence’ (Turing [1950]) is that it stands against the background of a
series of remarkable results in metamathematics (cf. Kleene [1952], Hao 
[1962], and Barwise [1982]) to which he has eminently contributed 
(especially in Turing [1936]). His well known claim that computing 
machinery can be made to display human-like intelligence (and the ensuing 


* I am grateful to Claude Lamontagne and Jean-Pierre Delage for their comments on this paper. 




attempts at carrying out this programme) sparked a philosophical debate 
which has raged for the past thirty-five years. We shall not be addressing 
Turing’s contention directly; rather, we shall attempt to contribute to an even 
more general issue which we nevertheless believe to have a direct bearing on 
it, namely the relation between certain developments in metamathematics 
(and by extension the philosophy of mathematics) and epistemology. Strictly 
from the point of view of the formal referents attendant upon Turing’s 
proposition, it should indeed appear surprising that they could have led to, or 
somehow been transformed into, an epistemological argument about the 
nature of intelligence, and thereby the possibility of explaining or understand- 
ing it. The mystery is only somewhat reduced by recognizing the introspective 
basis (Hodges [1983], ch. 2) of Turing's formalization of the concept of decision 
procedure; it does not vanish altogether. The step from a formalism to the
possibility of a theory or explanation must somehow have been directed; it is in
need of a ‘rationalization’. Our effort here is to indicate one possible such path,
or at least the grounds on which it might be or have implicitly been laid.

We will first outline the connections between the formalism and a particular 
epistemology. They will be developed at two levels: the formal and the 
contentual. This will allow us to discuss and broaden a principle, implicit in 
Turing's programme and explicit but restricted in the epistemology, addressing 
the issue of the relevance or relation of the ‘formal’ propositions (the 
metamathematical and the epistemological) to a theory of subjective know- 
ledge processes such as intelligence. 

At the formal level, the idea is to show for one specific epistemology, namely 
Popper's, that structurally it partakes of the formal theory in a manner 
resembling that postulated between intelligence and a central concept of this 
theory. Thus at this level we are drawing a parallel between two ‘formal’ 
systems. Turing's claim is thus strengthened on the grounds that, at least in its 
form or underlying structure or logic, epistemological understanding is itself 
bound to the formalism. In other words, the structure of knowledge processes 
proposed by Popper is identical with the form (i.e. it is computational, in the 
fundamental sense of metamathematics) which a theory of knowledge must take 
according to Turing. That is, the Popperian rationale for the evolution of 
knowledge coincides with a central concept of the formalism. This explains 
how and why the formalism is relevant to the problem of understanding 
understanding.¹ Conversely it could be argued that an important part of the
quest that nourished the development of the formalism is consistent with the 
goal of epistemology, or was in fact epistemological. 

At the contentual level, the argument turns to some important issues 


1 The intuitive appeal of the ‘self-referential’ provisions of metamathematics, as used in Gödel’s
incompleteness theorem, understood as indicating the relevance of the formalism to a so-called
computational study of mind, could find its epistemological counterpart in the ‘self-referential’
problem of understanding understanding. See further below, especially Section 3.




discussed both in (Popperian) epistemology and in the philosophy of 
mathematics. The connections deepen with respect to these at least, since the 
distinction between the purely epistemological and the purely metamathema- 
tical aspects of the problems cannot really be made. The issues we examine are 
simply common concerns. Thus the parallel is not limited to the formal level. 
Indeed at the contentual level these issues could probably all be argued to 
center upon the problem of the relation between the theory of objective 
knowledge and the theory of the knower. This problem obviously lies at the 
heart of Turing’s proposition. 

From the point of view of the development of certain key components of 
Popper’s philosophy, these contentual issues could probably be linked to a 
broader problem-context which also impelled, and in the unfolding of which 
would also have participated, the development of metamathematics. But our 
more modest perspective leads us rather to launch our discussion from the 
following observation. The theory of knowledge can be approached from two 
directions: [1] there can be, as in Popperian epistemology and metamathema- 
tics, a theory of objective knowledge processes; [2] but there can also be, as 
Turing suggested, a (so-called computational) theory of subjective knowledge 
processes such as intelligence. Now although the distinction between objective 
and subjective knowledge processes might seem natural, in light of Turing’s 
proposal (as well as Popper’s in fact) it actually largely fades away. The 
triumph of mathematical logic lies, as will be seen in Hilbert’s quote below, in 
its objectification of logical thinking processes through a logical calculus (a 
method that centered upon, and eventually required the formalization of, the 
concept of rule). As a result, these processes, formerly held to be largely subjective, 
became mathematical objects, i.e. objective entities. This paved the way for an 
objective study of logical thought. By the same reasoning it follows that for an 
enlarged domain of subjective knowledge processes such as are involved in 
vision and intelligence, the same metamorphosis must occur. Again, as in 
mathematical logic, this transition into the objective realm will presumably be 
achieved through some calculus. At this point however, the rules of objective 
knowledge must also apply to these 'subjective' processes. To a very significant 
extent, it is mathematical logic (and especially Turing) that has shown us how 
to approach this enlarged object. 

In fact, Popper himself has apparently done something similar with his 
‘principle of transference’ which states that ‘what is true in logic [objective 
knowledge] is true in psychology [subjective knowledge]’.² Within the
Popperian framework, this principle hinges upon an objective appreciation of
the products of thought, i.e. on the view that these products are autonomous (or
subject-independent) entities. On the other hand Popper has shown only
sporadic interest in metamathematics³ and, apart from the important

2 As well as ‘in scientific method and in the history of science’ (Popper [1972], p. 6).


3 In 1947-48 he published a series of papers in intuitionistic logic. Complete references can be 
found in Schilpp [1974] on pages 1217-19: see under 1947(a), (b), (c), and 1948(b), (c), (e).




acquisition of Tarski's theory of truth,⁴ little if any in the relation between it
and his epistemology.⁵ Indeed this seems remarkable, in view of the fact that
Popper has placed the problem of the evolution of knowledge at the center of
epistemology, while problems concerning the processes of logic (e.g. derivation)
lie at the heart of metamathematics, or mathematical logic. It turns out
however that it is also possible to understand the principle of transference with 
respect to a theory of the (subjective knowledge) processes from which these 
products issue (i.e. the mechanics of their emergence), the possibility of which 
owes a great deal, as indicated above, to the metamathematical domain.

More specifically, at the formal level the parallel will be outlined in terms of
the co-incidence of the metamathematical concept of recursion with the 
structure of the schema proposed by Popper. At the contentual level it will be 
developed with respect to a set of issues bearing on the relation between formal 
constructs and reality, namely ‘self’-reference, objectivity, and the question of
limits. Our proposal should be viewed as an attempt to bring this problem of the 
relevance of abstractions for penetrating reality to the realm of subjective 
knowledge or knowing, itself considered as belonging to the same reality; this 
is what Turing's programme and Popper's principle are about. Such an 
understanding will naturally be grounded in Popperian epistemology, and will
take the form of a slightly modified, or extended, principle of transference. The 
epistemological ‘rationalization’ of Turing's programme is thus achieved. 


2 PS & RS: THE FORMAL CONNECTION 


The first element of our proposition is that the form of Popper's picture or 
schema of the knowledge acquisition process (or the structure of its 
characteristic proceeding) corresponds to the metamathematical notion of 
recursion, a notion which recognizably lies at the heart of Popperian 
epistemology, and most clearly within what might be called the Popperian 
schema (henceforth referred to as ‘PS’) for the essential dynamics or logic of 
research.⁶


4 See his ‘Philosophical Comments on Tarski's Theory of Truth’ (Popper [1972], ch. 9). It is
important to note that this connection between Popperian epistemology and Tarski’s
metamathematical result is different from the one we are trying to elucidate: whereas the
theory of truth ‘contributed’ to Popper's schema (by allowing the explicit and non-ambiguous
(re-)introduction of the notion of truth within it), we are not attempting to ‘complement’ or
‘enhance’ it in any way, but merely to draw a certain parallel between Popper's epistemology
and other central concepts and issues in metamathematics.

5 The only explicit, but very indirect, statement we have found to that effect is in Popper's
Autobiography (Popper [1974], p. 105) where he recounts a meeting with Gödel in 1950 in
which they discussed ‘some aspects of the possible significance of his incompleteness theorem
for physics’. Cp. this with the discussion on p. 104.

6 Contrary to the English edition of Popper's Logik der Forschung, we do not equate ‘Forschung’
with ‘discovery’, a word we believe, in our context at least, to be potentially misleading (the
book does not offer a theory of discovery); this is apparently less so for ‘logic’; in fact the word
immediately suggests that there is a certain logical structure to the research process and
moreover that it can be understood.




In order to substantiate this idea we shall need to look more closely at 
recursion, as well as at PS. 

Although the series of problems (e.g. the paradoxes of set theory, the nature
of the infinite) and issues (e.g. axiomatics, consistency and completeness, proof
methods) in the foundations of mathematics responsible for the emergence of
metamathematics and culminating in the set of fundamental results issuing
therewith in the thirties—i.e. from the publication of Gödel's incompleteness
theorem (Gödel [1931]) to Turing's equivalence proof of ‘Computability and
λ-definability’ (Turing [1937])—is highly interesting in itself, for our purposes it
is only necessary to extract a few key elements from those developments and 
results. 

In the first place, let us start with a statement concerning the goal of 
mathematical logic: 


The purpose of the symbolic language in mathematical logic is to achieve in
logic what it has achieved in mathematics, namely, an exact scientific treatment
of its subject-matter. The logical relations which hold with regard to judgments,
concepts, etc., are represented by formulas whose interpretation is free from the
ambiguities so common in ordinary language. The transition from statements to
their logical consequences, as occurs in the drawing of conclusions, is analysed
into its primitive elements, and appears as a formal transformation of the initial
formulas in accordance with certain rules, similar to the rules of algebra; logical
thinking is reflected in a logical calculus. This calculus makes possible a
successful attack on problems whose nature precludes their solution by purely
intuitive logical thinking. Among these, for instance, is the problem of
characterizing those statements which can be deduced from given premises.⁷


Mathematical logic is therefore interested in doing with the (logical) processes 
involved in mathematics (or a given mathematical system, such as number 
theory) what mathematics itself has done for its own objects, namely to 
formalize them; this entails that ‘the mathematical theory . . . becomes itself 
the object of a mathematical study’ (Kleene [1952], p. 55). Now a 
mathematical system involves propositions (understood in a very general 
sense, which includes everything from ‘1+1=2’ right up to Goldbach’s
conjecture) and, e.g., the establishment of their truth, which in turn involves
judgments and the drawing of conclusions based upon the application of rules.⁸
The idea of rule seemed to lie at the core and to be enmeshed in much of 
mathematics. (The one rule here seems to have been: there must be a rule, 
although we do not know why that intuition should have turned out to be so 


7 Hilbert & Ackermann [1938], p. 1, emphasis added. The idea that logical thinking could be
reflected in a logical calculus had long been anticipated by Frege. (See e.g. Frege [1884] and
[1918].)

8 The notion of rule encompasses not only computational rules of the numerical kind encountered
in arithmetic but also the non-numerical ones as well, e.g. inference rules. The best known rules
are of course of the first kind, but they are not necessarily the most ‘representative’. However
for arithmetic at least, Gödel has shown them to be equivalent.




useful.⁹) In order to ‘mathematize’ or formalize these logical operations (so that
mathematical propositions concerning them be possible),¹⁰ it was necessary to
explicate these operations, and therefore the concept of rule, in non-ambiguous
and symbolic terms.¹¹ This is the task metamathematics had set for
itself.

One of the most remarkable outcomes of these efforts was that within a very 
short period of time (1931-7) not only one but in fact three different 
formalizations of the notion of rule (or algorithm, procedure, etc.) were 
achieved. Recursiveness (or more precisely, general recursiveness) was one of 
them, the other two being λ-definability and Turing machine computability.
The fact that these various schemata are equivalent gave credence to the belief 
that the intuitive notion of rule had indeed been formally captured. The non- 
trivial nature of this assertion is easily appreciated by observing that each 
rendition arose out of quite distinct metamathematical endeavours, only 
Turing in fact having set out to tackle the problem directly. (See Kleene [1981]; 
cf. p. 56)

The general recursive functions are those that can be defined with ‘the 
apparatus of primitive recursion, together with use of the minimization 
operator μ’.¹² Primitive recursion, on the other hand, takes the form of an
inductive definition in which ‘the value of the right-hand side may be calculated in
terms of already defined functional values’.¹³
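
As an illustration only, the two ingredients just quoted can be sketched in a few lines of Python. The names used here (primitive_recursion, minimization, add, mul, isqrt) are our own assumptions and are not taken from Minsky [1967]; the sketch merely shows how each value is computed from already defined values, and how the operator μ searches for a least witness.

```python
def primitive_recursion(base, step):
    """Build f with f(0, *xs) = base(*xs) and f(n + 1, *xs) = step(n, f(n, *xs), *xs)."""
    def f(n, *xs):
        value = base(*xs)                 # f(0, *xs)
        for k in range(n):                # each new value uses only the previous one
            value = step(k, value, *xs)
        return value
    return f


def minimization(predicate):
    """Unbounded search (the operator mu): least n >= 0 with predicate(n, *xs) true.

    The loop does not terminate if no such n exists, which is why the
    resulting functions are in general only partial.
    """
    def mu(*xs):
        n = 0
        while not predicate(n, *xs):
            n += 1
        return n
    return mu


# Hypothetical examples, each defined from functions already at hand.
add = primitive_recursion(lambda x: x, lambda k, acc, x: acc + 1)        # add(n, x) = n + x
mul = primitive_recursion(lambda x: 0, lambda k, acc, x: add(x, acc))    # mul(n, x) = n * x
isqrt = minimization(lambda n, x: mul(n + 1, n + 1) > x)                 # integer square root
```

Here mul is defined in terms of add, and isqrt in terms of mul, so that every definition rests on what has already been constructed.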

Recursion is not itself a rule (although it could be considered a metarule); it
is, rather, the precise expression of the nature of rule. It is a mould or a shell 
into which can be fitted, or according to which can be structured, any 
particular algorithm or rule. It embodies a form of understanding, one 
furthermore, as argued in the next paragraph, that is basically genetic. 

Indeed it will have been noted that the recursive schema (henceforth 
referred to as ‘RS’) presents a highly constructive image. This is not 
coincidental. In the first place, it is no doubt due to the finitistic imperative of 
metamathematics (which antedates the formulation of RS), according to 
which mathematical entities had to be produced (and could not merely be 
referred to in absentia as is sometimes done in existence proofs). But there is a 
second reason for this. It has to do with Hilbert’s programme of ‘scientific


9 Possibly because ‘intuition’ has not yet been formalized, and therefore remains . . . intuitive.

10 Such as the following: that every true proposition in arithmetic is provable or derivable from the
axioms and inference rules of a system which can represent arithmetic, i.e. that arithmetic is
complete. (The fact that this is not so was shown by Gödel in 1931; see Gödel [1931].)

11 The rules involved in that enterprise are of course less clear (i.e. they remain intuitive) than the
very explicit statements of rules which are sought for the ‘object theory’. Kleene [1952], p. 62,
writes: ‘Rules have been stated to formalize the object theory, but now we must understand
without rules how those rules work.’

12 Minsky [1967], p. 184. There is no need here to go into the details of the nature and necessity of 
the minimization operator as well as the exact nature of primitive recursive functions. 

13 Emphasis added; ibid., p. 175. The ‘equations’ can also be found on that page. In the general 
case the ‘already defined values’ need not be those of the same function. 




treatment’, whereby a reduction to the most primitive constituents and 
elementary ‘laws’ is sought allowing for a (hopefully fool-proof) re-construction 
of the whole edifice of mathematics. This is the axiomatic method. It leads quite 
naturally to a genetic view of mathematics, in which the end-products (e.g. 
arithmetical propositions) are to be shown deducible from, or a consequence 
of, the ‘actions’ of a very small initial (irreducible) set of elements and principles 
(or axioms). Following Gödel's and others’ work, it has become clear that for
sufficiently complex systems (such as arithmetic) it is impossible to completely 
satisfy this requirement; however no alternative finitary or mechanical 
scheme, supposing one to be possible, allowing any, let alone a full, 
systematization of an open body of knowledge such as arithmetic has as yet 
emerged. Nevertheless the practice of axiomatization has become common for 
scientific theories; and no doubt the ‘synthetic method’ of proceeding, in a
hierarchical manner, from the simple to the complex, in a word of considering 
phenomena as ‘growth’ phenomena, presides over a large number of our 
attempts at systematization. Indeed understanding itself can be understood as 
a constructive process. 

We now turn to PS, which incorporates all of the essential features of 
Popper’s epistemology. At this point it is important and instructive to remind 
ourselves that according to Popper ‘The fundamental problem of the theory of 
knowledge is the clarification and investigation of [the] process by which... our 
theories may grow or progress.’ (Popper [1972], p. 35; author's italics) This 
process is summarized by him in the following manner (ibid., p. 119):¹⁴


$P_i \rightarrow TT \rightarrow EE \rightarrow P_{i+1}.$


‘That is,’ explains Popper, ‘we start from some problem $P_i$, proceed to a
tentative solution or tentative theory TT, which may be (partly or wholly)
mistaken; in any case it will be subject to error-elimination, EE, which may
consist of critical discussion or experimental tests; at any rate, new problems $P_{i+1}$
arise from our own creative activity; and these new problems are not in
general intentionally created by us, they emerge autonomously from the field of
new relationships which we cannot help bringing into existence with every 
action, however little we intend to do so.' (ibid.) 

Thus the process is idealized as being fuelled by the unforeseen (i.e. new and 
mostly problematical) consequences of our attempted solutions (or simply any 
action, which can then also be seen as belonging to the class of 'tentative 
solutions', a view that Popper has pursued and developed in the context of the 
relevance and relation of PS to psychology and to the theory of evolution (ibid., 
chapters 6 & 7)) to difficulties that arose out of the attempted solutions of 
previous problems, etc. In fact these unintended consequences form the main 


14 We have slightly altered the subscripts in the ‘equation’ in order to bring out more forcefully the
open-ended character of PS. Furthermore, with the possible exception of the first term, each
term should be considered a set.




functional link, occurring through what Popper describes as ‘a feed-back 
effect’ (ibid., p. 119), between ‘ourselves’ and the objective domain of problems 
and theories which he calls ‘world 3’.

Interestingly, the ‘standard examples’ which Popper cites of this (in fact 
partial) autonomy ‘may be found in the theory of natural numbers’, as for 
instance the distinction between odd and even numbers, or the existence of 
primes, etc. (ibid., p. 118), which were discovered or invented (and in the context 
of which he actually gives the example of recursive functions¹⁵).

It is worthwhile to note that PS has been argued by Popper, although it is not
a biological theory, to have some relevance to the problem of evolution, and 
although it is not a historical doctrine, to be able to shed light on historical 
problems (especially in the history of science, but also in the history of music 
for instance), and finally although it is not a psychological theory, to provide 
important insights on language, learning, and perception.¹⁶ These
problem-domains arguably share a concern with progression, change, and emergence,
which Popper's discussions illuminate. But why should such vast areas of 
human activity be appreciable along those lines; or, why should the recursive 
structuring inherent in PS be so productively applicable to them? We believe it 
is because RS defines the structure of any explanation. 

The similarity of the basic structure or form of Popperian epistemology, as 
expressed in PS, to the formal notion of recursion found in mathematical logic 
should now be apparent. On the one hand PS suggests that all knowledge 
should be viewed as a ‘developmental’ outcome of an extant knowledge base, 
the key force within that evolution being the inescapable (or unintended) 
problematical consequences attendant upon any current level of knowledge; 
on the other hand, RS provides the means for defining or effectively 
constructing a vast class of entities (the ‘effectively calculable’ functions), in a 
manner which allows for the definition to be expressed in terms of the extant 
part of the definition (along with an independent operator μ). Thus in both
cases there is an explicit reference to ‘past’ levels of knowledge or definition, 
but also and more importantly, a recognition (and proof, in the case of the 
metamathematical formulation) of the nearly complete sufficiency of that 
recourse for purposes of production or explanation. For our purposes, it is not 
important that it be complete, merely that it is common, and central, to both 
mathematical logic and Popperian epistemology. 


15 Cf. ibid., p. 118 and ch. 3, s. 6, ‘Appreciation and Criticism of Brouwer's Epistemology’.
(Brouwer is the founder of the intuitionistic school in metamathematics.) With respect to the
previous example of primes, and using Popper's terminology, we could rephrase Gödel's
incompleteness theorem by saying that there can be no ‘axiomatic definition’ of the natural
number system such that all of its ‘unintended consequences’ can be made apparent: thus we
are forced to invent and discover. It is on the basis of such fundamental results that a good case
can be made for the independent reality of these systems. Cf. e.g. Barker [1969]. Cp. note 18
below.

16 Cf. Popper [1972] and [1968], as well as his Autobiography (Popper [1974]).




3 CONTENTUAL CONNECTIONS 


The question of reference, the issue of objectivity, and the problem of limits 
focus on further fundamental aspects of knowledge. Thus it is not surprising to 
see them arise in both mathematical logic and epistemology. In the present 
section we briefly explore, mainly from the epistemological standpoint, some of 
the intricate links that connect and bind them on these contentual matters. It 
should be noted that the formal level of argument set out in the previous 
section provides the background against which these contentual matters are 
approached, and furthermore that the two are not independent. 


3.1 'Self'-reference 

The 'object theory' of Popperian epistemology is (objective) knowledge; the 
goal of epistemology is to understand it: how it came about (i.e. how it evolves) 
and what its ‘structure’ is. The ‘object theories’ of mathematical logic are 
mathematical systems; its goal is to analyse their properties in terms of 
axiomatization, methods of proof, consistency, completeness, etc. As usual, 
one represents or somehow refers to its object; however here ‘reference’ is also 
part of the object: scientific theories refer to reality, while mathematical 
propositions refer to other mathematical entities (e.g. variables stand for 
numbers). In fact ‘reference’ is at the core of all knowledge. This peculiar state 
of affairs has led, in both the case of metamathematics and (at least) of 
Popperian epistemology, to the definition of quite consistent general schemata 
of reference, as formulated in PS and RS: in particular they share and revolve 
about the notion that the domain of the ‘knowable’ (the derivable; scientific 
theories) is critically, or ‘sufficiently’, rooted in an immediate, or ‘self’-, 
reference. That is, a state of knowledge is always described in the context of 
another state of knowledge. 

This much more primitive, or bare, self-reference should not of course be 
confused with the personal experience of the self, as is sometimes done when 
the former is held to be a sort of key to the latter, or when they are 
argumentatively used as equivalents; such an ‘interpretation’ has to our 
knowledge shed little light on the experience. The notion that it is an important 
datum may nevertheless be seen as pointing to the much greater importance of 
the ‘lesser’ reference. Thus the subject is absent from any critical consideration 
in both metamathematics and Popperian epistemology; but objectively, ‘self’- 
reference is not. 


3.2 Objectivity 

In studying Popper’s work one quickly develops a sense for the amazing (if not 
miraculous) fact that knowledge, and in particular scientific knowledge, is at 
all possible. Although Popper warns that we should not try to explain why our 
theories are successful (because as he argues we cannot), there is nevertheless 


316 Jean-Roch Beausoleil 


a residual problem linked to the fact that our theories and the phenomena 
upon which they bear are different things (even if that to which they refer is 
largely dictated by our theories—cf. previous sub-section). In our view, the
amazing character of knowledge stems from the fact that given this difference 
our theories are nevertheless sometimes found to have objective relevance (e.g. 
they allow us to predict events). Our comment in that respect is simply the 
following, which concerns what Popper calls the autonomy of knowledge. 
Recall that the 'standard examples' he gave for explaining it were drawn from 
the mathematical domain. Indeed, this suggests that what might be called 'the 
objective connection’ finds not only its purest instantiation, but also its 
'reason', in the mathematical domain, at least in the sense that no more 
primitive level than the mathematical could be found to explain it. This seems 
to us to open a door onto a possible understanding of the objective connection: 
it could be, simply, that the most important unintended consequence of our 
theoretical constructions is their objectivity, that is, their potential applicability 
to the world. Of course we wish them to be relevant right from the start; 
however it is not because we wish them or strive for them to be so that they are 
or can be. That is, the problem of objectivity is not a question of being right or of 
developing a true or successful theory: it is rather a more general question of 
producing a theory at all, i.e. producing relevant statements about the world. 
To some extent Popper has recognized this.¹⁷ Certainly, in wishing our theories
to be relevant to something which largely escapes us, it makes sense that that 
by which a connection is established also partakes of that property. Thus it is 
possible for something to be objective and at the same time to be ‘unreal’. Our 
knowledge of the world is like the world in its autonomy; and it also largely 
escapes us. 

We would add at this point the following brief remarks about this problem. 

As concerns the system of natural numbers, not only is it the case that they 
lead to an unforeseeable set of objective consequences which must be 
discovered, but they lead to an inexhaustible set of such consequences, a set 
which, furthermore, cannot, at least with the present methods, be completely 
systematized (Gödel [1931]). This reveals a new level of consistency between
PS and mathematics, one based on the strange proposition that because of this 
fact our mathematical constructions themselves can be considered as rich as 
reality, or at least that they constitute a reality of their own. Popper of course 
talks about world 3 (to which, incidentally, belongs mathematics —cf. Popper 
[1972], p. 136), and in a manner highly reminiscent of the way in which 
mathematical philosophy discusses the question of the status of mathematical 


17 Popper ([1972], p. 159) states: ‘it is possible to accept the reality or (as it may be called) the 
autonomy of the third world [i.e. world 3], and at the same time to admit that the third world
originates as a product of human activity’. (author's italics) 




entities, as well as the limits to our knowledge of them.¹⁸ However, Popper has
not derived or even based his arguments concerning the permanent short- 
comings (as opposed to limits in the formal sense, although his references to 
mathematical logic sometimes appear not to be only for illustrative purposes) 
of ‘ordinary’ knowledge upon such metamathematical results, although he 
was aware of their formation firsthand. His starting points were rather the 
problems of demarcation and induction. Nevertheless, we again find here a 
coherence between the ‘unending quest’ and the formal theorems. 

What does allow Popper to refer to these results as ‘extensible’ to scientific 
propositions (for instance) is the fact that scientific theories are formulated, a 
sine qua non for critical evaluation and error elimination; since a formulation 
occurs with the use of a certain language, and since some (particularly Tarski’s) 
fundamental metamathematical results pertain to language, he is quite 
justified in doing so. Of course this is limited to formulated, or objective, theories 
and says nothing about their origin nor their possibility; in order again to 
‘extend’ the formal results to that level, it would be necessary to be in the 
presence of a formulated theory about the origins of the theory. This shows 
that it is not possible to derive from the formalism a proof of the impossibility of 
explaining such things. Thus the formalism, even though it serves as the 
‘foundation’ for objectivity, does not allow us to place a definite boundary upon 
the objective domain. 


3.3 Limits 


If metamathematics’ rule is that ‘there must be a rule’, Popper's rule seems to 
be that ‘there is no rule’ (e.g. of induction19). Nevertheless, he formulates one,
viz. the method of critical trial and error elimination. This is nearly a play on a 
word, for we would argue that in fact these two statements are equivalent. 
Metamathematics has shown for instance that there cannot be a single rule 
which would allow us to determine for every case the provability or truth of a
given mathematical proposition. Closely related to this, Turing has, 
furthermore, shown that there is no rule which can help us decide in general if 


18 P. Bernays, G. Gentzen, K. Gödel, A. Heyting, among others. Popper says ([1972], p. 161), with
reference to Gödel, that ‘an infinity of problems will always remain undiscovered’ and that ‘In 
spite and also because of the autonomy of the third world, there will always be scope for original 
and creative work.’ Compare this with Gentzen's (Gentzen [1969], p. 240) ‘paraphrase’ of
Gödel’s incompleteness theorem: ‘for number theory no adequate system of forms of inference
can be specified for once and for all,...new theorems can always be found whose proof 
requires new forms of inference’. 

19 ‘Sensible rules of inductive inference do not exist.’ (Popper [1974], p. 117) Note that, at least 
from Hume onward, epistemologists had also been searching for a ‘rule or principle on which all 
theories should be based’ (ibid., p. 118; emphasis added). 

20 This is the decision problem for formal systems. Such a general procedure was shown by Turing
to be impossible; see Turing [1936]. 




a given procedure is in fact one.21 Thus we could just as well rephrase
metamathematics’ rule as: ‘there is no (general or unique) rule’. Conversely, 
Popper has explicitly recognized the procedural nature of the scientific 
endeavour. He says: ‘Only if we know how to abide by a refutation do we know 
how to speak about reality [presumably do science]. If we wish to formulate this 
readiness or "knowledge how”, then we have to do it again with the help of a 
rule of procedure. It is clear that only a performance rule can help us here, for 
speaking about reality is a performance.’22 Thus the scientific enterprise, as far as it
is formulated by a theory of scientific knowledge, is also rule-based.23
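
The halting result mentioned above can be illustrated by a minimal sketch, in Python, of the diagonal construction usually associated with it. The sketch is an illustration only and is not Turing's own formulation: the names `make_contrarian` and `refutes` are hypothetical, and `claimed_halts` stands for any candidate rule, assumed here to be deterministic and to apply to parameterless procedures.

```python
def make_contrarian(claimed_halts):
    """Build a procedure that does the opposite of what the candidate rule predicts for it."""
    def contrarian():
        if claimed_halts(contrarian):
            # The rule said "halts", so the procedure loops forever.
            while True:
                pass
        # The rule said "does not halt", so the procedure halts at once.
    return contrarian


def refutes(claimed_halts):
    """Return True when the candidate rule is shown to be wrong about its own contrarian."""
    contrarian = make_contrarian(claimed_halts)
    if claimed_halts(contrarian):
        # Prediction "halts": by construction the procedure would loop,
        # so the prediction is wrong. (It is deliberately not run.)
        return True
    # Prediction "does not halt": running the procedure shows that it halts.
    contrarian()
    return True


if __name__ == "__main__":
    # Two trivial candidate rules (hypothetical): "everything halts", "nothing halts".
    print(refutes(lambda procedure: True))
    print(refutes(lambda procedure: False))
```

Whatever deterministic rule is supplied, the construction yields one procedure about which that rule is mistaken; this is the sense in which no single rule decides the matter in general.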

The theory of computation teaches the limits to what can be instantiated in 
a rule (cf. examples above). However, in light of the fact that all scientific 
theories are expressions of rules, and the fact furthermore that even the theory 
of that knowledge is formulated in a set of methodological rules, there seems to
be no alternative: our quest for understanding will always be a quest for rules. 
But rules can be modified, and they are open to criticism; they are limited (e.g.
when they are wrong), but never absolutely limitative, i.e. there is no inherent
boundary, short of the totality, to the domains upon which they can bear. 
Moreover, the limits that would apply for instance to a computational theory of 
intelligence are also applicable to the ‘metatheory’ of intelligence (i.e. criticism 
of the possibility of achieving one). Indeed it is fallacious to try to infer from the
various limitations attendant upon the formal or (meta)mathematical knowledge
any sort of absolute limitations attendant upon a theory; it is not because
there are limits to theory, even on supposed formal grounds, that no theory is 
possible. In the Popperian context the very existence of knowledge contradicts 
this, for most of it is error-bound. Interestingly the above view is arguably 
regarded by Popper (with respect to physics at least) as self-evident, ‘in view of 
the mathematical background of physics'. (Popper [1974], p. 104) Thus 
scientific theories suffer from the limitations imposed by formal theory because 
they are formalized (or formalizable, or can be argued to be); but the formalism 
says nothing about the possibility of a theory. It does however, insofar as there 
is this association between the two of them, and even more importantly
between it and epistemology, allow us to suppose that the formalism cannot be


21 This is the halting problem, i.e. the question as to whether there can be a way of deciding if any
given programme will halt or come to a decision, i.e. whether it constitutes an effective decision 
procedure; see also Turing [1936]. 

22 Popper [1968], p. 212. All except the next to last are the author's italics. 

23 In order to avoid a possible source of confusion at this point, it should be realized that we are not
proposing that the scientist's work is nothing but some mechanical application. Indeed original
scientific or mathematical work occurs whenever there is a break from mechanisms, i.e. the
known; nevertheless these leaps are organized according to PS in a way which is mechanical,
albeit in a manner similar to the way in which the theory of evolution offers a mechanical
explanation of the development of life. This is the level at which the parallelism we are
indicating occurs. 




surpassed: for the very act of formulating a theory, even an epistemological 
theory, immediately brings it within the range of its grasp.24


4 THE EXTENDED PRINCIPLE OF TRANSFERENCE 


We stated above that ‘the formalism says nothing about the possibility of a 
theory’; yet this is what Turing was talking about. How then could this step 
have been taken; what could have allowed him to nevertheless conjecture the 
relevance of the formalism with respect to the problem of intelligence? 

We have in fact already stated our answer, which lies in the establishment of
a parallel between a theory of knowledge (namely Popper’s, and more 
specifically, PS) and mathematical logic (specifically, RS). Quite simply, the 
fundamental consistency between them is founded on the notion that in 
reflecting upon problems of knowledge identical forms of understanding have (more 
or less independently) emerged. Thus the belief that the organ of knowledge, 
namely intelligence, partakes of the same dynamic form is no longer even 
surprising.

According to Popper, the key to the transition from subjective to objective 
knowledge consists in formulating or expressing a proposition. Thus if some 
essential and incontrovertible subjective component, i.e. one which by
definition cannot be formulated, does enter into intelligence, it will be true that 
no 'satisfactory' theory of intelligence can ever be developed, of the type for 
instance which could be brought into play in a machine. This means that such 
a theory or machine, if ever one is achieved, will be subjectless (i.e. it will not,
since it cannot, comprise some irreducible subjective element), a consequence 
which only appears paradoxical in view of the persistent, but up until recently 
largely unquestioned, belief in the critical importance (to state it vaguely) of 
‘the subject’ or of the unavoidable involvement of a ‘purely’ subjective domain
or entity for human intelligence.25

Interestingly, the whole of Popperian epistemology is founded on a radical
shift away from a very similar bias in modern epistemology, which led him to 
recognize 'the priority of the study of logic over the study of subjective thought 
processes’ (Popper [1974], p. 61; author's italics) and to an ‘Epistemology 
Without a Knowing Subject’ (Popper [1972], chapter 3). He has moreover 
argued that the study of these subjective thought processes can be greatly 
enhanced by considering them from an objective point of view, that is in terms 


24 Following his criticism (Popper [1972], p. 225, note 39) of Turing’s ‘imitation game’ proposed
in Turing [1950], Popper might also describe this argument as ‘an intellectual trap’. 

25 Although he 'predict[s] that we shall not be able to build electronic computers with conscious 
subjective experience’ (Popper & Eccles [1977], p. 208, emphasis added), Popper does not say that 
the latter is necessary for intelligence; however he is possibly leaning in that direction when he
suggests that the path to the creation of a ‘mechanical’ intelligence might be found, as in
nature, in the artificial creation of life. 




of their interaction with objective knowledge.26 In the same manner, one could
say that Turing proposed a psychology without a ‘subject’. (This is no doubt 
the reason why his programme has been called behaviouristic.) Thus the 
subject itself is not negated, it is merely removed as the essential component 
within psychology, just as it has been removed as the essential component in 
epistemology by Popper. 

Again, in Popperian epistemology the key to the transition from logic to 
psychology is the principle of transference.27 Now the logic to which he refers is
essentially the ‘logic’ of trial and error (Popper [1974], p. 1024), which is the 
logic of the knowledge process. Yet, much more is involved in subjective 
knowledge processes than is indicated by this formal rendering, namely its 
mechanics.28 However Popper does not propose such a theory of subjective
knowledge processes. On the other hand metamathematics has developed 
some elements of the logic of logic, or the logical process, a logic moreover that 
can also be transferred, in the Popperian sense, into the mechanics of subjective 
knowledge. Thus, we are led to formulate an extended principle of transference 
which states that what is true in mathematical or symbolic logic is true in 
psychology (understood as the theory of subjective knowledge processes). We 
believe this to have been the basis of Turing's programme. 

As we understand it, the original principle of transference carries the 
methodological rules inherent to PS into the domain of subjective processes. The 
extended principle not only abides by this but goes further in proposing that 
the subjective contents themselves and the mechanisms and operations that 
give rise to them can be stated and encompassed within the metamathematical 
formalism, in particular recursion and the theory of computation.29 This view
naturally leads to the adoption of a constructivist viewpoint for psychological
theory.30


26 Ibid., chapter 4, ‘On the Theory of the Objective Mind’. Note that while Popper admits the
existence of subjective knowledge, the only differences between it and objective knowledge are
that the latter is ‘linguistically formulated’ and can consequently be ‘submitted to critical
discussion’; they are both described as ‘expectations’. (Popper [1972], p. 66)

27 See note 2 above. Although ‘admittedly a somewhat daring conjecture in the psychology of 
cognition or of thought processes’ (Popper [1972], p. 6), it should not be considered more so 
than the reverse (but now ancient) Humean ‘inductive conjecture’ that purportedly drew
epistemological consequences from psychological considerations. 

28 E.g. the mechanics of vision; see for instance Lamontagne & Beausoleil [1982].

29 In the present context it is only a matter of circumstance that Turing's formalization is much
more directly related to his programme, because it is formulated in terms of computing
machinery, than is recursion, which is nevertheless equivalent to the former and from which
can much more clearly be apprehended the fundamental relationship discussed here which we 
believe to be the basis of the said programme. 

30 This can again best be seen in the context of vision research, where the perception of a ‘higher’ 
entity is thought to be dependent upon the detection of ‘lower’ ones. For example the perception 
of curves is based upon the perception of line segments, which in turn is based upon the
perception of ‘points’. Thus the theory of curve perception would be rooted in the theory of line 
segment detection, etc. The constructivist approach in effect states that there cannot be an 
independent or ‘direct’ perceptual knowledge of higher entities; they are always a function of
others. But this is equivalent to maintaining that perception is recursively structured. See also 
Lamontagne & Beausoleil [1982]. 




Stated another way, the extended principle of transference holds that, just as 
the original principle precludes the existence of knowledge processes that 
could overthrow (or are not amenable to) the PS logic, there do not exist 
knowledge mechanics that fall outside the descriptive compass of RS. 

No doubt this is not the rationale Turing entertained when he wrote on 
‘Computing Machinery and Intelligence’. But in view of the formal background
to his proposition, in view of the formal and contentual RS-PS parallel,
and in view of Popper's principle of transference, we believe that the extended 
principle of transference can serve to elucidate and indeed constitute a new 
basis for Turing’s programme. Through this parallel Turing’s programme and 
the principle of transference are united, and while the former gains in 
epistemological clarity, the latter is reinforced. 

Since our deepest understanding of the structure and processes of knowledge,
both from the formal-metamathematical and epistemological standpoints,
points in the same direction, our proposition, encapsulated in the
extended principle of transference, simply amounts to saying that this should 
also be the case for the structure and processes of subjective knowledge. 

Thus Popperian epistemology assumes, in the context of theories and 
research on subjective knowledge processes, and more specifically with respect 
to the logic of Turing’s programme, a heretofore unsuspected relevance.


5 CONCLUDING REMARK 


Turing speculated and argued that the principles of computation which he had 
defined were not limited to the purely logical domain but could potentially be 
brought to play upon the whole of subjective knowledge processes.31 Likewise
Popper has suggested that the logic of objective knowledge which he has 
proposed provides a framework for the theory of subjective knowledge. In both 
cases there is thus a transfer from the objective to the subjective realm. We 
have tried to show that Turing's implicit use of such a principle rested upon a 
larger kinship between the epistemological and metamathematical endeavours,
a similarity which was observed at the formal as well as contentual
levels, and which led us to formulate the extended principle of transference. Of 
course neither Popper's logic of research (including his principle) nor Turing’s 
logic of computation (including his programme) constitute in themselves a 
theory of subjective knowledge; rather they define the nature of these theories, 
or of understanding and explanation for knowledge phenomena. 


School of Psychology 
University of Ottawa 


31 This point is further elaborated in the first chapter of Beausoleil [1987].




REFERENCES 


BARKER, S. F. [1969]: ‘Realism as a Philosophy of Mathematics’, in J. J. Bulloff, T. C.
Holyoke and S. W. Hahn (eds.) Foundations of Mathematics: Symposium Papers
Commemorating the Sixtieth Birthday of Kurt Gödel, pp. 1-9, Springer-Verlag.

BARWISE, J. (ed.) [1982]: Handbook of Mathematical Logic, second edition, Amsterdam, 
North-Holland. 

BEAUSOLEIL, J. R. [1987]: On Deriving Percepts and Producing Movement: Theoretical 
Studies of Periphery-Bound Knowledge Processes, Doctoral Dissertation, Ottawa, 
University of Ottawa. 

FREGE, G. [1884]: Die Grundlagen der Arithmetik, Breslau, Koebner. Translated as The 
Foundations of Arithmetic by J. L. Austin, Oxford, Basil Blackwell, 1953. 

FREGE, G. [1918]: ‘Der Gedanke: Eine logische Untersuchung’, Beiträge zur Philosophie
des deutschen Idealismus, 1, pp. 58-77. Translated as ‘Thoughts’, in Logical 
Investigations, P. T. Geach (ed.), Oxford, Basil Blackwell, 1977. pp. 1-30. 

GENTZEN, G. [1969]: The Collected Papers of Gerhard Gentzen, M. E. Szabo (ed.), 
Amsterdam, North-Holland. 

GÖDEL, K. [1931]: ‘Über formal unentscheidbare Sätze der Principia Mathematica und 
verwandter Systeme I', Monatshefte für Mathematik und Physik, 38, pp. 173-98. 
(Reprinted and translated in his Collected Works, edited by S. Feferman et al., Oxford 
University Press, 1986, vol. I, pp. 144-95.) 

HAO, W. [1962]: A Survey of Mathematical Logic, Peking, Science Press.

HILBERT, D. & ACKERMANN, W. [1938]: Principles of Mathematical Logic, second edition.
Translated by L. M. Hammond, G. G. Leckie, and F. Steinhardt, edited by R. E. Luce,
1950, New York, Chelsea.

HODGES, A. [1983]: Alan Turing: the Enigma, New York, Simon & Schuster.

KLEENE, S. C. [1952]: Introduction to Metamathematics, Princeton, Van Nostrand.

KLEENE, S. C. [1981]: ‘Origins of Recursive Function Theory’, Annals of the History of
Computing, 3, pp. 52-67.

LAMONTAGNE, C. & BEAUSOLEIL, J. R. [1982]: ‘Achieving Visual Spatiality: Towards a
Psychologically Relevant, Physiologically Plausible, and Computationally Efficient
Conjecture’, Cognition and Brain Theory, 5, pp. 343-65.

MINSKY, M. L. [1967]: Computation: Finite and Infinite Machines, Englewood Cliffs, N.J.,
Prentice-Hall.

POPPER, K. R. [1968]: Conjectures and Refutations, New York, Harper & Row.

POPPER, K. R. [1972]: Objective Knowledge, Oxford University Press.

POPPER, K. R. [1974]: ‘Autobiography’ and ‘Replies to my Critics’, in Schilpp [1974], pp.
3-181 and pp. 961-1197.

POPPER, K. R. & ECCLES, J. C. [1977]: The Self and Its Brain, Springer International.

SCHILPP, P. A. (ed.) [1974]: The Philosophy of Karl Popper, Library of Living Philosophers
XIV, La Salle, Illinois, Open Court.

TURING, A. M. [1936]: ‘On Computable Numbers, with an Application to the
Entscheidungsproblem’, Proceedings of the London Mathematical Society, series 2,
42, pp. 230-65, 43, pp. 544-6.

TURING, A. M. [1937]: ‘Computability and λ-definability’, Journal of Symbolic Logic, 2,
pp. 153-63. 

TURING, A. M. [1950]: ‘Computing Machinery and Intelligence’, Mind, 59, pp. 433-60. 


Brit. J. Phil. Sci. 40 (1989), 323-332 Printed in Great Britain


Note on Entropy, Disorder 
and Disorganization1


K. G. DENBIGH 


1 Introduction 

2 Open Systems 

3 The 'Meaning' of Entropy 

4 Entropy, Disorder and Disorganization 


1 INTRODUCTION


The occasion for this note was my viewing a recent TV programme in which 
several scientists, physical as well as biological, took part. I was surprised by 
their expression of the view that the spontaneous formation and evolution of 
living systems is a phenomenon not easily reconcilable with thermodynamics. 
Of course formerly this idea had been widespread; but I had hoped that a 
number of studies, including some of my own, had disposed of the notion that 
any actual contradiction is involved, i.e. between entropy increase, on the
one hand, and, on the other, certain less rigorously definable processes such as 
might characterise biological evolution. For example increase of 'orderliness', 
increase of 'organization' or of 'complexity'. These terms are far from meaning 
the same thing, but have all been supposed—erroneously in my view—to be 
contraries to the process of entropy increase. 

I shall argue that the Second Law of Thermodynamics, though having an 
important scope, is actually much less strongly restrictive about ‘what may be 
going on in the cosmos' than is widely thought. As it happens the notion of 
‘entropy’ is now much used in literary and journalistic circles. It has become a 
vogue word, displayed on T-shirts, and taken as a measure of everything that is 
supposedly deteriorating or getting worse. These far-fetched uses of the
entropy concept will not be eradicated until scientists themselves declare them 
to be mistaken. And indeed they amount to something much more than a mere 
misuse of words for they have resulted in scientific error as well, notably in 
biology. 

1 This note is based on a paper given to a B.S.P.S. meeting at Sussex University in September
1983, and on the book Entropy in Relation to Incomplete Knowledge by K. G. Denbigh and J. S.
Denbigh, Cambridge University Press [1985].


2 OPEN SYSTEMS 


A preliminary point, needing only a few paragraphs, is that the Second Law 
does not apply directly to ‘open systems’. Let the reader be reminded that 
thermodynamics makes a distinction between four classes of systems,2 according
to the constraints imposed upon them. They are:


(a) Isolated systems. These do not allow the transfer either of matter or of 
energy across their boundaries. 

(b) Adiabatically isolated systems. Here the transfer of heat (and also of matter)
across the boundaries is excluded, but not the transfer of other forms of 
energy. 

(c) Closed systems. Here the constraints are further relaxed and only the 
transfer of matter is excluded. 

(d) Open systems. These are those systems, defined by their geometrical 
boundaries, which allow the passage of energy together with the 
molecules of some (but not necessarily all) chemical substances.3


An abbreviated statement of the Second Law, to the effect that a system's 
entropy can only increase (or remain constant), is directly applicable only to 
systems belonging to classes (a) and (b). Systems in classes (c) or (d) may 
actually undergo decreases of entropy, due to an outwards passage of heat 
and/or of matter across the boundary surface. But of course the Law can still be 
made to apply to a sufficiently enlarged system, i.e. to the sub-system of interest
together with all that part of its environment such that the total system 
satisfies those conditions which bring it into classes (a) or (b).4 It can then
occur that a decrease of entropy in the sub-system is compensated by an equal 
or greater increase of entropy of the environment. Such behaviour is fully 
consistent with the Second Law. 
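
The compensation just described can be sketched numerically. The following is a minimal illustration, not a general entropy balance: it assumes that heat q leaves a closed sub-system held at a constant temperature and enters an environment also held at a constant (lower) temperature, and the function name `entropy_balance` and the numerical values are hypothetical.

```python
def entropy_balance(q, t_sub, t_env):
    """Entropy changes (J/K) when heat q (J) passes from a sub-system held at
    t_sub (K) to an environment held at t_env (K).

    Assumes both temperatures stay constant during the transfer.
    Returns (change in sub-system, change in environment, change in total).
    """
    if t_sub <= 0 or t_env <= 0:
        raise ValueError("absolute temperatures must be positive")
    ds_sub = -q / t_sub   # the sub-system loses entropy with the heat it gives up
    ds_env = q / t_env    # the cooler environment gains more than was lost
    return ds_sub, ds_env, ds_sub + ds_env


if __name__ == "__main__":
    # Hypothetical values: 1000 J passing from 310 K to 290 K.
    ds_sub, ds_env, ds_total = entropy_balance(1000.0, 310.0, 290.0)
    print(ds_sub, ds_env, ds_total)
```

Under these assumptions the sub-system's entropy falls while that of the enlarged system rises, which is all that the Second Law requires.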


2 The term 'system' here refers to whatever part of the physical world is being discussed. For
purposes of simplicity I have taken the terms ‘heat’ and ‘energy’ as being understood prior to
the defining of the four classes, even though a systematic presentation of thermodynamics
usually proceeds rather differently. Perhaps it should be added that in the case of those systems
such as the stars in which there occurs a relativistic interconversion of matter and other forms
of energy the distinction between openness and closedness loses its significance.

3 In the case of open systems the concept of ‘heat’ must be used with great caution, due to the fact
that when matter is transferred across the boundary of a system it carries energy with it.

4 It should be mentioned that the constraints used for the definition of classes (a) and (b) are not
the only sorts of constraints used in thermodynamics. Instead of talking about isolated systems 
we can discuss systems which are held at constant volume and temperature by being contained 
in a rigid vessel immersed in a sufficiently large thermostat. The system plus thermostat is an 
instance of an 'enlarged system' as referred to above. Whereas the entropy of the enlarged 
system tends to a maximum, the 'characteristic function' of the sub-system held at constant
volume and temperature is the Helmholtz free energy, and this tends to a minimum. It is
important to notice however that the application of this criterion (or the corresponding one in 
terms of the Gibbs free energy when the sub-system is held at constant pressure and 
temperature) continues to be limited to closed systems. There is no such criterion or tendency 
attributable to open systems. 






Now the vast majority of naturally occurring systems belong to class (d). 
‘Isolation’ and 'closedness' are not natural states of affairs. Consider a few 
examples of open systems. Celestial bodies very slowly lose their atmospheres 
by diffusion of gaseous molecules into space; but they also gain matter in the 
forms of dust and meteorites and they gain radiation from hotter bodies. For 
these reasons it may be difficult to say precisely whether their entropies are 
increasing or diminishing. Living creatures are also a very significant sub-class 
of open systems. For instance an individual cell continuously takes up 
metabolites through its enclosing membrane and this material undergoes 
chemical reactions within the cell interior resulting in a variety of low- and 
high-molecular weight products. Some of these pass out of the cell; others 
contribute to the cell’s growth and to its eventual division. The exchange of 
metabolites between a cell and its environment is essential to the maintenance 
of cell function in all of its aspects. 

In fact quite complicated behaviour can be displayed by open systems 
without such behaviour being in any way contrary to the Second Law. 
Particularly striking is the occurrence of ‘dissipative structures’ as described by 
Prigogine [1980, 1984] and his collaborators. They have described how an 
otherwise improbable structure can become stabilised within an open system 
at the expense of the compensating entropy production due to, say, an energy 
flow through that system. Many experimentally studied examples, of the type 
of the Belousov-Zhabotinskii reaction, have been described in the literature. 

Returning to the case of living creatures it may first be remarked that the 
making of an accurate entropy balance on an organism together with its 
environment (i.e. the ‘sufficiently enlarged system’ referred to above) is a
matter of very considerable difficulty. Nevertheless such experimental evidence
as is available (e.g. Linschitz, 1953) has not revealed any contravention
of the Second Law. 

What then is meant by those who maintain that the phenomenon of life is 
not easily reconcilable with this law? I think that what they have in mind is a 
rather vague assertion to the effect that (a) organisms are highly ‘orderly’ and/ 
or ‘organized’ systems; (b) ‘orderliness’ and ‘organization’ are inversely related 
to entropy; and therefore (c) organisms are abnormally low entropy systems. 
They may then wish to go beyond (a), (b) and (c) and to assert that organisms 
somehow avoid the Second Law, even after allowing for their 'openness'. Or 
without actually going so far they may wish only to say that highly orderly 
and/or organized systems are exceedingly improbable, that improbability is
related to low entropy and that Einstein's fluctuation theory is sufficient to 
show that, even within an open system, a fluctuation sufficient to produce a 
local increase of order and/or of organization of sufficient magnitude is
exceedingly unlikely. 

If one or the other of these assertions is a fair statement of what is meant, I 
shall argue that the whole argument is confused and fails because, in fact, 




there is no necessary connection between entropy and either ‘orderliness’ or 
‘organization’. 


3 THE ‘MEANING’ OF ENTROPY 


Consider a physico-chemical system subject to the constraint that it is closed. 
Let it change during a period of time between its states numbered 1 and 2 
which are determined by a sufficient number of macroscopic variables such as 
energy, volume and composition. In thermodynamics its entropy change is 
defined by the relation 


$$S_2 - S_1 = \int_1^2 \left(\frac{dq}{T}\right)_{\mathrm{rev.}} \qquad (1)$$


where dq is the heat intake over any infinitesimal part of the change, T is the 
corresponding temperature (absolute) and the subscript 'rev.' denotes that the 
change is made to occur reversibly—i.e. through a sequence of states none of 
which are displaced more than infinitesimally from equilibrium states.5
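
As an illustration of definition (1), consider a closed system of constant heat capacity C which is heated reversibly from temperature T1 to T2, so that dq = C dT and the integral evaluates to C ln(T2/T1). The sketch below checks this by summing dq/T over many small reversible steps. The function names and the numerical values are hypothetical, and the constancy of the heat capacity is an assumption of the example.

```python
import math


def entropy_change_numeric(heat_capacity, t1, t2, steps=100_000):
    """Approximate the integral of dq/T from t1 to t2 with dq = heat_capacity * dT,
    using the midpoint rule over equal temperature steps."""
    dt = (t2 - t1) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t1 + (i + 0.5) * dt      # temperature at the middle of the step
        total += heat_capacity * dt / t_mid
    return total


def entropy_change_closed_form(heat_capacity, t1, t2):
    """Exact value of the same integral for a constant heat capacity."""
    return heat_capacity * math.log(t2 / t1)


if __name__ == "__main__":
    # Hypothetical values: C = 75.3 J/K, heated from 300 K to 350 K.
    print(entropy_change_numeric(75.3, 300.0, 350.0))
    print(entropy_change_closed_form(75.3, 300.0, 350.0))
```

The two values agree closely, and both depend only on the end states 1 and 2, as a difference of a state function should.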

Since (1) applies only to closed systems it leaves no scope for the making of 
an immediate application of the entropy concept to open systems such as living 
creatures. However, as noted above, open systems can often be regarded as 
being embedded in a much larger closed system to which (1) is indeed 
applicable. In fact one can define an entropy flow vector between the open 
system and the closed system which contains it. 

According to thermodynamics entropy is nothing other than it is defined to be 
by (1). Of course entropy has interesting and useful properties but it is (1) which
supplies its meaning. On the other hand statistical mechanics (whose objective 
is to link thermodynamics with atomic and kinetic theory) produces several 
alternative definitions of statistical entropy. None of them is precisely the same 
as the thermodynamic entropy and none of them is identical with the others.6
Gibbs called them ‘entropy analogues’ and this is a good usage because it helps 
to reduce confusion (a confusion which has subsequently been greatly 
increased by the emergence of various ‘information entropies’).

It remains the thermodynamic entropy to which the Second Law refers and 
indeed it has been one of the long-standing problems of statistical mechanics to 
achieve a rigorous proof that any of the analogues display an increasing 
tendency. 


5 The fact that this equation defines only a difference of entropy is of no relevance in this note 
where the issue is whether entropy changes are related to changes of ‘orderliness’ or of
‘organization’. 

6 For a discussion on the relationships between thermodynamic entropy and some of the
statistical entropies see, for example, Penrose [1970]. 




In the present context the most perspicuous analogue is the one associated 
with the names of Boltzmann and Planck: 


$$S_{BP} = k \ln W \qquad (2)$$


where k is Boltzmann’s constant.7 $S_{BP}$ is the statistical entropy of a closed
system constrained to a fixed energy, volume and composition (and such other 
macroscopic variables as may be relevant). It corresponds to what Gibbs called 
the ‘micro-canonical ensemble’. W is then the number of independent 
quantum states (energy eigenstates) which are accessible to that system. $S_{BP}$ is
thus a logarithmic measure of the extent to which the constraints limit the 
accessibility of the eigenstates out of their otherwise infinite number; W is a 
measure of ‘spread’ (Guggenheim, 1949). (It may be added that it is a quite 
unnecessary gloss to assert that either $S_{BP}$ or $S_G$ is a measure of our ignorance
about the actual quantum state momentarily occupied by the system.)8
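
How a constraint limits W, and hence the statistical entropy k ln W, can be seen in a toy model. The sketch below is an illustration only: it assumes a hypothetical system of N independent two-level units, for which fixing the energy amounts to fixing the number of excited units; the function names are likewise hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def accessible_states(n_units, n_excited):
    """Number of distinct ways of distributing n_excited excitations over n_units
    two-level units: the W of the toy model at fixed energy."""
    return math.comb(n_units, n_excited)


def boltzmann_planck_entropy(w):
    """Statistical entropy k ln W for W accessible states."""
    if w < 1:
        raise ValueError("W must be at least 1")
    return K_B * math.log(w)


if __name__ == "__main__":
    n = 100
    w_fixed_energy = accessible_states(n, 50)   # energy constrained to 50 excitations
    w_unconstrained = 2 ** n                    # every configuration allowed
    print(boltzmann_planck_entropy(w_fixed_energy))
    print(boltzmann_planck_entropy(w_unconstrained))
```

Tightening the constraint reduces the number of accessible states and so lowers k ln W; nothing in the calculation refers to what an observer happens to know.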

If an ‘interpretation’ of entropy is desired, Guggenheim’s notion of ‘spread’
seems to be the best candidate. Gibbs himself appears to have toyed with the 
idea that entropy might be understood as ‘mixed-up-ness’.? Whether he would 
have gone ahead with the idea seems problematic, for he was surely aware of 
the existence of counter-examples. Think for instance of a finely divided 
emulsion of oil in water contained in an adiabatically isolated container. As 
time goes on its entropy can only remain constant or increase. Nevertheless 
the ofl and water gradually become unmixed and separate out as two liquid 
layers. The notion of entropy as mixed-up-ness could then only be ‘saved’ by 
using the term in some non-obvious sense—e.g. by resorting to the molecular 
rather than to the macro-level of description. But if so one would surely do 
better to talk about Boltzmann and Planck's W! 

In summary: if one wishes to substantiate a claim or a guess that some 
particular process involves a change of thermodynamic or statistical entropy 
one should ask oneself whether there exists a reversible heat effect, or a change 
in the number of accessible energy eigenstates, pertaining to the process in 
question. If not, there has been no change of physical entropy (even though 
there may have been some change in our ‘information’). 


7 The Gibbsian-style analogue $S_G = -k \sum_i p_i \ln p_i$, where $p_i$ is the probability of the i'th energy
eigenstate, is actually more useful than is $S_{BP}$, even though it is less perspicuous. It corresponds
to the 'canonical ensemble' and the constraint of constant temperature replaces that of
constant energy.

8 In information theory an ‘informational entropy’ is defined as being a measure of the
‘uncertainty’ or of the ‘missing information’ relating to whatever entity or event is under
discussion. The definition has the same form as that of $S_G/k$ in the previous footnote, but in
general the probabilities $p_i$ do not refer to the same events. Typically they are the probabilities of
the symbols occurring in a message, and in such cases the ‘informational entropy’ has no
bearing on the Second Law. Neither is it related to ‘orderliness’, 'organization' or ‘complexity’.
The matter is discussed by Denbigh and Denbigh [1985].

9 There occurs the title ‘On entropy as mixed-up-ness’ in a list of Gibbs’ unpublished fragments
[1961]. 




4 ENTROPY, DISORDER AND DISORGANIZATION 


Following these necessary preliminaries let us proceed to the ‘interpretations’ 
which are the topics at issue—i.e. entropy regarded as disorder or as a measure 
of disorganization. In my view these interpretations are erroneous, or at best 
ambiguous, and in fact there are no pairwise relationships of identity between 
any of the three terms appearing in the above heading. Although I have made 
these points in previous publications I will here present them in rather more 
formal terms. 

It may be remarked first of all that the notions neither of disorder nor of 
disorganization are related at all obviously to the quantities on the right hand 
sides of equations (1) and (2) which are defining equations for thermodynamic 
and statistical entropy respectively. Indeed the 'interpretations' at issue are 
dubious from the very beginning in view of the fact that 'entropy' is a term 
which belongs firmly within science whereas the terms 'order' and 'organiza- 
tion', and their negations, do not. Their meanings are very broad and are 
subject to large variations according to context—e.g. political, legal, etc. 

Let us first try to characterise 'order' and 'orderliness', and their negations, 
in the restricted context of space and time, and then attempt to do the same for 
'organization'. 

It is a question, I think, of whether or not the set of entities (objects or events)
under discussion are distributed in space and/or time according to some rule. 
For example a set of three or more objects, A, B, C, etc., will display a certain 
kind of orderliness if they exist in a linear arrangement. Thereby the objects 
obey the rule that B is to the right of A, that C is to the right of B, etc., when 
viewed from anywhere on one side of the line. The same objects will display an 
additional kind of orderliness if there is also a relationship (e.g. of equality, of 
doubling, etc.) between successive separations, AB, BC, etc. There is then a
more comprehensive state of order. 

Now natural objects will not usually lie exactly on a straight line, or on any 
other simple geometric figure such as a circle or an ellipse. The arrangement of 
the objects has to be recognized¹⁰ as approximating to that figure and the
question therefore arises: how large a standard deviation (or comparable
measure) is permissible for a given set of entities to qualify as orderly in some
particular respect? Indeed one might distinguish between 'order' and 'orderliness',
the former being taken as referring to an ideal state in which there is
complete agreement with the rule, and the latter being a measure of the extent 
to which some particular set of real entities approximates to that ideal. 
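
The distinction between 'order' and 'orderliness' can be made concrete by a minimal sketch, offered only as an illustration. It assumes that the rule is 'the objects lie on a straight line', that deviation is measured by the root-mean-square vertical residual from a least-squares line, and that a tolerance is supplied by the observer; the function names and the tolerance parameter are hypothetical.

```python
import math


def rms_deviation_from_line(points):
    """Fit y = slope * x + intercept by least squares and return the
    root-mean-square vertical residual of the points from that line."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values coincide; the fit is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in points]
    return math.sqrt(sum(r * r for r in residuals) / n)


def is_orderly(points, tolerance):
    """Judge the set 'orderly' if its deviation is within the tolerance.
    The tolerance is a convention chosen by the observer."""
    return rms_deviation_from_line(points) <= tolerance


if __name__ == "__main__":
    exact = [(0, 0), (1, 1), (2, 2)]
    rough = [(0, 0), (1, 1), (2, 0)]
    print(rms_deviation_from_line(exact), rms_deviation_from_line(rough))
    print(is_orderly(rough, 0.5), is_orderly(rough, 0.1))
```

The same set of points counts as orderly under one tolerance and not under another, so the verdict depends on a choice that the arrangement itself does not supply.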


10 When orderliness has to be recognised this gives emphasis to the view that it often has an
essentially subjective character. This is also shown in instances where orderliness depends on
the choice of a convention, e.g. in the example of that part of an ‘ordered’ sequence of playing
cards which has been assigned the values A>K>Q>J>10.




The foregoing points can now be considered in relation to entropy by taking 
the example of a crystalline solid. There is certainly no difficulty in 
appreciating that a crystal is an orderly arrangement of its component 
particles in so far as these lie at positions close to the intersection points of a 
geometric lattice. The question is: How orderly is it? For the supposed 
interpretation of entropy as orderliness is useless unless the concept of 
orderliness is capable of being made at least as quantitative as is that of entropy 
itself, and this has not been achieved in a relevant scientific manner. 

To see that this is so, recall that W in equation (2) is calculable, at least to 
high approximation, for the case of a perfect lattice, and it is also calculable 
after allowing for various kinds of crystal imperfections. But there is no theory 
of orderliness independent of statistical mechanics which would provide the 
means of calculating a crystal's degree of orderliness, and which would 
therefore permit the claim: "Behold, we have shown that W, and thus
statistical entropy, is an exact measure of disorder!"¹¹

The error or ambiguity which is involved in the identification of entropy 
with disorder is well illustrated by an example I have used previously: the 
spontaneous crystallisation of a super-cooled melt. Under adiabatic conditions 
the entropy of this system increases, but it would involve special pleading to 
substantiate a claim to the effect that its disorder also increases! 

Similar considerations apply in other contexts. In my view some of the many 
discussions in the literature on the evolution of the universe from the Big Bang 
onwards have been weakened by attempts to apply the notions of 'chaos' and 
'disorder'—and also ‘uniformity’—as if these were equivalent to using the 
Second Law. 

Let us turn to a consideration of ‘organization’. It is the supposed relation of 
'disorganization' to entropy increase which is my primary interest in this note 
because of its presumed biological significance.

In the biological literature the terms 'order' and 'organization' are often 
used interchangeably as if they were synonyms. That this is not the case is 
shown by the existence of counter-examples. For instance a patterned 
wallpaper is surely much more orderly, in having almost exact repetition, than 
is, say, a Cezanne, but is much less highly organized. Any great painting 
displays organization to a high degree but its parts are not related to each other 
by a rule, such as is characteristic of a state of order. Similarly a living cell is 
more highly organized than is a crystal even though the latter is much more 
orderly, at least in the spatio-temporal context. The existence of these pairs of 
entities which display the qualities in question in reverse order of ranking is 
sufficient to show that these qualities cannot be the same. 


11 An interesting mathematization of the concept of order has been put forward by Bohm [1980]
but it does not provide what is needed in the present context. Nevertheless in his [1987] he
shows the extent to which a probabilistic treatment of the implicate order links up with
Prigogine's theory of entropy.




So far so good but the actual meaning of ‘organized’, as tacitly understood in 
the last paragraph, is very difficult to specify, as has been recognised and 
discussed by a number of philosophical biologists.¹²

Consider some examples where the concept 'organized' is manifestly 
applicable. Just as the items depicted in a painting are spatially organized, so 
also the notes and bars in a musical composition are temporally organized. 
Similarly the component parts of a scientific theory or a mathematical treatise 
are organized in a logical space. 

Businesses and other sorts of institutions are exemplary cases and they bring 
out once again the distinction between 'organized' and 'orderly'. A diagram 
representing the structure of a business may be such that it has none of the 
attributes of orderliness, such as symmetry and regularity; and yet it clearly 
represents an organization in so far as it shows the flow of work from one part 
of the business to another, and thus it relates to time as well as to location. 

There are many other instances of systems which are temporally, as well as 
spatially, organized. For example in the living body the various processes 
(chemical reactions, diffusions, etc.) are clearly coordinated with each other in
time, as well as in space. Yet other examples of organized entities are skilled 
performances, such as cycling, piano playing, using speech, and so on, and 
also mechanical and electrical structures such as cars, computers, etc. And 
indeed all things belonging to what Popper calls World 3. 

In several of these examples, especially those relating to physical structures, 
it may appear that the notion of ‘organized’ is much the same as that of 
‘complexity’. Yet there is an important difference. A computer is complex but if 
a few of its internal connections were to be broken it could no longer compute 
and, in my view, is then no longer to be deemed an organization, even though 
it remains complex. 

In short I believe that the most essential aspect of what is meant by an 
organized system is that the system in question has a function; it can do 
something or can be used to do something. As examples a mathematical 
treatise can be used to derive some further results and a legal system can be 
used to achieve justice and orderliness in society (whilst not being 'orderly' in 
itself). Even so this notion of function remains ambiguous when it is applied to 
living things. It would be altogether too vague to claim that the function of an 
organism is to survive and to reproduce. As I have remarked elsewhere (1981):
'... an organism cannot be regarded as an assembly of previously known 
components put together for the purpose of achieving a specific objective . . . 
the “functions” of an organism are read into it on the basis of hindsight . . .'. 


12 For a review with references see Peacocke [1983]. Bohm [1980, 1987] uses the term 
‘organized’ as meaning the ‘working together’, in a coherent way, of all aspects of a structure. 




Having thus briefly sketched what I believe to be the accepted usage of the 
term ‘organized’ let me turn to the third possible equivalence—i.e. that which 
might exist between entropy and disorganization. That this does not in fact 
exist can be seen from the occurrence of counter-examples. Think for instance 
of a fertile bird's egg inside an incubator. The latter contains a sufficiency of air 
and was initially raised to a temperature high enough for the hatching of the 
egg. The incubator was thereafter surrounded by perfect thermal insulation 
with the consequence that its total entropy can only increase or remain 
constant. However there remain two possibilities concerning a different aspect 
of the system’s temporal development: (1) the egg dies; (2) the egg lives and 
eventually gives rise to a live chick. Now it is true that in case (1) there is an 
entropy increase accompanied by a process of disorganization, localised in the 
egg. But the opposite is the situation in case (2): For although the egg is 
certainly a highly organized system, the live chick must surely be deemed to be 
much more so. Entropy again increases but now there is an increase in the 
degree of organization as well. This example thus provides a clear instance of 
its being false to suppose that entropy increase is equivalent to a process of 
disorganization. 

Notice that we habitually speak of some particular entity as being more or 
less highly organized than another. This suggests that we have an intuitive 
concept of amount or degree of organization, as was tacitly assumed in the 
previous paragraph. If so, examples such as that of the egg dying or hatching 
indicate that ‘amount of organization’ is not conserved, i.e. it need not remain
constant in time. Of course this is not to say that organisms operate in a 
manner contrary to the Second Law. That is not the case at all. The irreversible 
processes of metabolism, heat conduction etc., occurring within organisms are 
entropy producing like any others. It is only to say that changes in ‘amount of 
organization’ and of entropy can occur quite independently of each other. 

A similar conclusion was reached earlier in this note about changes of 
‘orderliness’ and of entropy being mutually independent. This bears out what 
was claimed at the beginning—i.e. that the Second Law is a good deal less 
restrictive than is commonly supposed. In addition to entropy there may well 
exist other ‘one-way functions’¹³ which add to the overall description of the
world's temporal development. 

King's College, London


13 Perhaps, for instance, concerning the total amount of organization in the planetary biospheres.
Of course such possibilities do not imply vitalism. That there can occur entirely natural 
processes of self-assembly (e.g. of collagen stacks) was described by Fox and Dose [1972] and 
they went on to develop their ‘constructionist’ theory of natural evolution. This supposes that 
separate ‘components’ come together through the action of natural forces and the resulting 
aggregate is then able to engage in certain physico-chemical ‘functions’ on which natural
selection can operate. 


Brit. J. Phil. Sci. 40 (1989), 333-356 Printed in Great Britain


Pragmatic Truth and the Logic 
of Induction 


NEWTON C. A. DA COSTA AND STEVEN FRENCH 


ABSTRACT 


We apply the recently elaborated notions of ‘pragmatic truth’ and ‘pragmatic 
probability’ to the problem of the construction of a logic of inductive inference. It is 
argued that the system outlined here is able to overcome many of the objections usually 
levelled against such attempts. We claim, furthermore, that our view captures the 
essentially cumulative nature of science and allows us to explain why it is indeed 
reasonable to accept and believe in the conclusions reached by inductive inference. 


1 Introduction 
2 Pragmatic Truth 
2.1 ‘Pragmatic’ and ‘Absolute’ Truth 
2.2 Simple Pragmatic Structures 
2.3 The Definition of Pragmatic Truth 
3 Inductive Arguments 
3.1 General Considerations 
3.2 Forms of Inductive Argument 
3.3 The Generalized Hypothetico-Deductive Method 
4 Pragmatic Probability 
4.1 Subjective Probability 
4.2 ‘Probabilizing’ Induction 
4.3 Pragmatic Propositions 
5 Inductive Logic 
5.1 Introduction 
5.2 ‘Bayes’ Principle’ 
5.3 Simple Induction and Analogy 
6 Philosophical Discussion 
6.1 Characteristics of our System 
6.2 The Problem of Universal Laws 
6.3 The Problem of a priori Probabilities 
6.4 The 'Hacking Problem' 
6.5 Induction and Theory Change in Science 




1 INTRODUCTION


Inductive arguments, in their most general form, may be described as 
inferences which are regarded as reasonable but which are not valid, in the 
strict deductive sense. Despite the apparent prevalence of such arguments in 
both science and everyday life, many philosophers have concluded that the
idea of an inductive logic is some kind of ‘philosophers invention’, a ‘make- 
believe’ theory which has mired scientific methodology in a morass of 
philosophical difficulties (van Fraassen [1985], pp. 258-81 and pp. 294-6, for 
example). 

Our intention in this paper is to elaborate on a system of inductive logic first 
laid down by da Costa [1987], which is based on a ‘pragmatic’ interpretation of 
the probability calculus and the concept of ‘pragmatic’ or 'quasi-' truth. By 
way of a probabilistic characterization of inductive inferences, we hope to 
show that this system can overcome the most important of the objections 
usually levelled against such attempts in general and against the Carnapian 
and Bayesian approaches in particular. 

Thus, in Section 2 below, we present a brief summary of the formalism of 
‘pragmatic’ truth. The fundamental notion here is that of a ‘partial structure’, 
that is, models in which the set of relations is not completely defined for all 
elements of the model. It is this latter aspect which opens the door to fallibilism 
and allows us to talk of ‘pragmatic’, ‘quasi-’ or ‘approximate’ truth (for the 
purposes of our discussion these three terms will be taken to be synonymous). 
We admit that this is a somewhat novel concept but we believe it to be an 
extremely fruitful one, with applications not only as regards the philosophy of 
truth and foundations of probability theory but also in the model-theoretic 
approach to the philosophy of science in general (see da Costa and French 
[198?a]).

In Section 3 the attempt to construct a logic of induction is defended in 
general terms and we emphasize the tentative nature of inductive inferences. It 
is precisely the latter, we believe, which can be captured by the notion of 
‘pragmatic’ truth. Thus, in Section 4 we outline da Costa's theory of
‘pragmatic’ probability based on this latter notion. By ‘probabilising’ inductive 
inferences we claim that their plausibility can be most adequately evaluated by 
means of 'pragmatic' probabilities, whether quantitative or qualitative. 

This then leads to the heart of the work, Section 5, in which we present our 
system of inductive logic. Its most important characteristics are that it is
tentative, local and instrumental. Tentative in that inductive inferences aim 
only for pragmatic truth; local in the sense that the application of the 
pragmatic probability calculus is circumscribed by the relevant conditions of 
the situation under consideration; and instrumental in the sense that 
induction should be regarded as merely a device for arriving at pragmatic truth 
and our system is only one of a number of possibilities. 




Finally, in Section 6 we suggest how our approach can overcome some of 
the standard objections to such systems and we indicate certain consequences 
with regard to the question of theory change in science in general.

We would like to conclude this introduction with a kind of 'statement of 
intent’. Our aim in this paper is essentially two-fold: first of all to present a 
possible system of inductive logic, based on the notion of ‘pragmatic’ truth as 
formalized in terms of partial structures; and secondly, to indicate how our 
system might overcome some of the problems which usually bedevil such 
attempts and to note some of the more interesting connections between our 
approach and other views in this area. We obviously do not intend our work to 
be anything like the last word on the matter but hope that we can show that, at 
the very least, it represents an interesting and possibly fruitful line of 
development. 


2 PRAGMATIC TRUTH 
2.1 ‘Pragmatic’ and ‘Absolute’ Truth 


In this section we briefly treat the concept of pragmatic truth (for details, see 
Mikenberg et al. [1986], da Costa [1986], and da Costa and Chuaqui [198?b]). 
Although the motivation for this concept came from a consideration of the 
views of certain pragmatists, such as C. S. Peirce, we do not claim that it reflects 
well the stance of any adherent of pragmatism. For this reason, among others, 
we also call the kind of truth to be defined below 'quasi-' or approximate truth. 
Anyway, it will be the foundation stone of our system of inductive logic. 

It is obvious that scientific theories, hypotheses and laws should not be 
accepted by the scientist as absolutely true (in the sense of the correspondence
theory of truth) but only as approximately true. Well established theories, such 
as Newtonian mechanics, for example, previously considered as absolutely 
true, were subsequently shown to be false. However, these theories continue to 
be employed in some domains as if they were true and, surprisingly enough, 
they work. It seems clear that, taking into account the lessons of the history of 
science, no theory or hypothesis should aspire to be absolutely true, although 
it may remain useful, being approximately true, within certain restrictive 
conditions. 

What happens in science happens also in the field of technology and in 
everyday life: some of our inductions, although strictly false, work since things 
occur as if they were true. 


2.2 Simple Pragmatic Structures 

The main problem now, of course, is to make the above ideas rigorous. To 
begin with, let us suppose that we are investigating a given domain of 
knowledge A of the empirical sciences, such as, for example, human genetics, 
the branch of biology concerned with human heredity. Our starting point is 




then the set of individuals composed of human beings. We are interested in
certain traits which are transmitted by biological inheritance among people.
The set of human beings, the individuals of our domain A, we denote by $\mathcal{A}_1$.
The relevant, actual, properties and relations among the members of $\mathcal{A}_1$, we
model by a family of partial relations (unary, binary, ...) $R_i$, $i \in I$; they are
partial because any relation $R_i$, $i \in I$, of arity n, is not necessarily defined for all
n-tuples of elements of $\mathcal{A}_1$. We envisage such relations as expressing what we
know or accept as true in connection with the actual relations linking the
members of $\mathcal{A}_1$. Summarizing then, the partial structure $\langle \mathcal{A}_1, R_i \rangle_{i \in I}$
constitutes a schematic model of what is known or accepted as true about the
actual structure of our domain A.

However, this structure is not strong enough to exactly mirror the domain
under consideration. It is convenient and even necessary to enrich our
structure through the introduction of new ideal individuals (chromosomes,
genes, etc.) and new partial relations among the extended set of partial
relations so obtained (genetic maps, intersexuality's chromosomal influences,
etc.), if we want to cope naturally with the problems originating in A. This
collection of new objects we denote by $\mathcal{A}_2$ and the family of new relations,
some of which may extend the old ones, we represent by $R_j$, $j \in J$. Of course, we
suppose that $\mathcal{A}_1 \cap \mathcal{A}_2 = \emptyset$, $I \cap J = \emptyset$, and we put $\mathcal{A} = \mathcal{A}_1 \cup \mathcal{A}_2$ and
$K = I \cup J$. In addition, we denote by $\mathcal{L}$ the language in which we talk about
the structure $\langle \mathcal{A}, R_k \rangle_{k \in K}$.

There exists a set of distinguished sentences (closed formulas) of $\mathcal{L}$ which
we assume as being true or that are really true (according to the
correspondence theory of truth). Among these distinguished sentences
there are true decidable sentences, i.e. true sentences whose truth or
falsehood can be decided, at least in principle (observation statements, for
instance) and also certain general sentences, encompassing laws and theories
assumed as being strictly true. We denote by $\mathcal{P}$ the collection of these
distinguished sentences.

Thus in order to obtain a model of A we are led to a structure of the following 
form: 

$\mathfrak{A} = \langle \mathcal{A}_1, \mathcal{A}_2, R_i, R_j, \mathcal{P} \rangle_{i \in I,\, j \in J}$


the sets $\mathcal{A}_1$ and $\mathcal{A}_2$, the families of relations $R_i$, $i \in I$, and $R_j$, $j \in J$, and the set of
sentences $\mathcal{P}$ satisfying the above conditions. Such a structure will be called a
simple pragmatic structure (sps). Its role in the theory of pragmatic truth is
analogous to that of the common set-theoretic structures in Tarski's theory of
truth (see Mikenberg et al. [1986]). Furthermore, the model theory based on the
concept of a sps extends extant model theory.

To simplify our exposition and employing the above symbols, we note that 
the notion of a sps can be redefined as follows: a sps is a structure 


$\mathfrak{A} = \langle \mathcal{A}, R_k, \mathcal{P} \rangle_{k \in K}$




where $\mathcal{A}$ is a non-empty set, $R_k$ is a partial relation defined on $\mathcal{A}$ for every
$k \in K$, and $\mathcal{P}$ is a set of sentences of a language $\mathcal{L}$ of the same similarity type as
that of $\mathfrak{A}$ and which is interpreted in $\mathfrak{A}$. For some k, $R_k$ may be empty; $\mathcal{P}$ may
also be empty. (In the general case, partial functions could also be included in a
sps.)

Let us be more explicit. We suppose that $\mathcal{L}$ is a first-order language with
equality, but without function symbols. We say that $\mathcal{L}$ is interpreted in a sps
$\mathfrak{A} = \langle \mathcal{A}, R_k, \mathcal{P} \rangle_{k \in K}$ of the same similarity type as that of $\mathcal{L}$, if:


1. each individual constant of $\mathcal{L}$ is associated with an element of the
universe $\mathcal{A}$ of $\mathfrak{A}$; and

2. each predicate symbol of $\mathcal{L}$ of arity n is associated with a relation $R_k$,
$k \in K$, of the same arity n, and this last association is surjective.


2.3 The Definition of Pragmatic Truth 


We are now in a position to formulate the two basic definitions of the theory of 
pragmatic truth (or quasi-truth or ‘approximate’ truth). They make rigorous 
the informal concept of pragmatic truth, as well as allowing us to mathematize 
it. (Our set-theoretic treatment of pragmatic truth is obviously analogous to 
Tarski's treatment of the classical conception of truth). Our definitions are the 
following (see da Costa and Chuaqui 198?b): 

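Definition 1: Let $\mathcal{L}$ and $\mathfrak{A} = \langle \mathcal{A}, R_k, \mathcal{P} \rangle_{k \in K}$ be respectively a language
and a sps, such that $\mathcal{L}$ is interpreted in $\mathfrak{A}$. Let $\mathfrak{B}$ be a total structure, whose
relations of arity n are defined for all n-tuples of elements of its universe, and we
suppose that $\mathcal{L}$ is also interpreted in $\mathfrak{B}$. Then $\mathfrak{B}$ is said to be $\mathfrak{A}$-normal if the
following properties hold:

1. The universe of $\mathfrak{B}$ is $\mathcal{A}$;

2. The (total) relations of $\mathfrak{B}$ extend the corresponding partial relations of $\mathfrak{A}$;

3. If c is an individual constant of $\mathcal{L}$, then in both $\mathfrak{A}$ and $\mathfrak{B}$ c is interpreted
by the same element;

4. If $\alpha \in \mathcal{P}$ then $\mathfrak{B} \models \alpha$.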


It may happen that a sps $\mathfrak{A}$ be such that there are no $\mathfrak{A}$-normal structures,
but a necessary and sufficient condition for the existence of such structures is
formulated in Mikenberg et al. [1986]. Here we suppose that our spss satisfy this
condition.

Definition 2: $\mathcal{L}$ and $\mathfrak{A}$ will denote respectively a language and a sps in
which $\mathcal{L}$ is interpreted. We say that a sentence $\alpha$ of $\mathcal{L}$ is pragmatically true in
the sps $\mathfrak{A}$ according to $\mathfrak{B}$, if $\mathfrak{A}$ is a sps, $\mathfrak{B}$ is an $\mathfrak{A}$-normal structure and $\alpha$ is true
in $\mathfrak{B}$ (in conformity with the Tarskian definition of truth); that is, we say that
$\alpha$ is pragmatically true in the sps $\mathfrak{A}$ if there exists an $\mathfrak{A}$-normal $\mathfrak{B}$ in which $\alpha$ is
true (in the Tarskian sense). If $\alpha$ is not pragmatically true in the sps $\mathfrak{A}$
according to $\mathfrak{B}$ (is not pragmatically true in the sps $\mathfrak{A}$), we say that $\alpha$ is
pragmatically false in the sps $\mathfrak{A}$ according to $\mathfrak{B}$ (is pragmatically false in the
sps $\mathfrak{A}$).
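
The two definitions can be illustrated, for a finite universe, by a minimal sketch in Python. Several simplifications are assumptions made here and are not part of the formal theory: a partial relation is represented as a mapping from tuples to True or False, with undefined tuples simply absent; sentences are represented as Python functions evaluated on a total structure rather than as first-order formulas; and individual constants (condition 3 of Definition 1) are omitted. All names are hypothetical.

```python
from itertools import product


def total_extensions(universe, arities, partial):
    """Yield every total structure that extends the partial relations.

    partial[name] maps a tuple to True or False where the relation is
    known to hold or to fail; tuples absent from the mapping are undefined.
    """
    undecided = [
        (name, tup)
        for name, arity in arities.items()
        for tup in product(universe, repeat=arity)
        if tup not in partial[name]
    ]
    for values in product((False, True), repeat=len(undecided)):
        total = {name: dict(known) for name, known in partial.items()}
        for (name, tup), value in zip(undecided, values):
            total[name][tup] = value
        yield total


def normal_structures(universe, arities, partial, accepted):
    """Yield the total extensions in which every accepted sentence is true."""
    for total in total_extensions(universe, arities, partial):
        if all(sentence(total) for sentence in accepted):
            yield total


def pragmatically_true(alpha, universe, arities, partial, accepted):
    """Return True if alpha is true in at least one normal structure."""
    return any(
        alpha(b) for b in normal_structures(universe, arities, partial, accepted)
    )


# Hypothetical example: three individuals and one unary relation "White".
universe = ["a", "b", "c"]
arities = {"White": 1}
partial = {"White": {("a",): True, ("b",): True}}  # undefined for "c"
accepted = []  # the set of accepted sentences, empty in this example


def all_white(s):
    return all(s["White"][(x,)] for x in universe)


def some_not_white(s):
    return not all_white(s)


def a_not_white(s):
    return not s["White"][("a",)]


if __name__ == "__main__":
    for alpha in (all_white, some_not_white, a_not_white):
        print(alpha.__name__,
              pragmatically_true(alpha, universe, arities, partial, accepted))
```

In this toy case both 'everything is white' and its negation come out pragmatically true, since each holds in some total extension, whereas 'a is not white' is pragmatically false because it conflicts with what the partial relation already settles.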

We claim that our rigorous definition of pragmatic truth captures the gist of
the informal notion of a proposition $\alpha$ being such that everything occurs in a
domain A as if it were true (cf. Mikenberg et al. [1986] and da Costa [1986]);
that is, $\alpha$ saves the appearances in A. Evidently $\alpha$ is pragmatically true or
pragmatically false only in a given sps, within more or less definite conditions
(certain limits of precision, determined ranges of variation of the relevant
quantities, etc.). On the other hand, $\alpha$ is quasi-true if it is not incompatible
with any element of $\mathcal{P}$; furthermore, $\alpha$ will often allow us to predict future
experience.

Sometimes the set $\mathcal{P}$ may be taken to include all true observation statements
in A; at others, we may restrict it to contain only those statements tested to be
true up until a fixed time. Thus a temporal aspect may be adjoined to our spss.
This is an important point since pragmatic truth is commonly considered as
consisting in the long term (past, present and future) saving of appearances. 

Of course, contradictory propositions (or incompatible theories) may be 
simultaneously quasi-true (thus suggesting a possible resolution of that 
realist’s bugbear, the problem of the empirical underdetermination of 
theories). This may happen in the same sps, as is clear, or in the same domain 
A, either because the propositions are quasi-true in the same sps employed to 
cope with certain problems related to A, or because they are quasi-true in two 
different spss which constitute two different kinds of models of A, aiming for 
distinct objectives. As we shall see, one of the objectives of the inductive logic to
be constructed here is precisely to handle this situation—one which is 
inconceivable from the point of view of classical inductive logic and indeed 
from the point of view of the classical methodology of science in general. 

In our opinion, pragmatic truth is 'the' way of theoretical science. We 
believe that science develops through the construction of pragmatically true 
theories based on spss which are each time more and more appropriate and
whose final goal is perhaps classical truth. Moreover, each step in this
endeavour does not destroy the preceding one and quasi-truth and truth, or
'complete' truth, may occasionally coincide. To put it another way, we firmly
believe that the progress of science, viewed from the standpoint of pragmatic
truth, is cumulative. 

(Following the general and specific criticisms of Kuhn and Feyerabend's 
views, this is a position which, in one form or another, seems to have come 
back into favour in recent years, although we do not intend to defend it in 
detail here.) 


3 INDUCTIVE ARGUMENTS 


3.1 General Considerations 
Inferences or reasonings are linguistically expressed by arguments. Any 




argument is composed of a set of premisses and a conclusion. An argument is 
valid if the conclusion cannot be false when the premisses are true; otherwise it 
is said to be invalid. Valid arguments are the object of study in deductive logic. 
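
The definition of validity just given can be illustrated by a minimal sketch, restricted by assumption to propositional arguments so that every valuation can be checked by truth table. The helper names and the two example arguments are introduced here for illustration only.

```python
from itertools import product


def is_valid(premises, conclusion, atoms):
    """Return True if no valuation of the atoms makes every premise
    true while making the conclusion false."""
    for values in product((False, True), repeat=len(atoms)):
        valuation = dict(zip(atoms, values))
        if all(p(valuation) for p in premises) and not conclusion(valuation):
            return False
    return True


def implies(a, b):
    """Material implication."""
    return (not a) or b


atoms = ["p", "q"]

# From "p implies q" and "p", infer "q".
modus_ponens = is_valid(
    [lambda v: implies(v["p"], v["q"]), lambda v: v["p"]],
    lambda v: v["q"],
    atoms,
)

# From "p implies q" and "q", infer "p".
affirming_consequent = is_valid(
    [lambda v: implies(v["p"], v["q"]), lambda v: v["q"]],
    lambda v: v["p"],
    atoms,
)

if __name__ == "__main__":
    print(modus_ponens, affirming_consequent)
```

The first argument is valid; the second is not, since its premisses are true and its conclusion false when p is false and q is true. Whether an invalid argument of this kind is nonetheless reasonable is a question that this deductive test cannot settle.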

In both everyday life and science we often make inferences considered to be 
reasonable but which are strictly speaking invalid. Such inferences and the
corresponding arguments are called inductive (or inductions). Our definition
of induction, then, is very broad, including all kinds of non-demonstrative
inferences. Inductive logic is concerned with just these inferences.

That induction is quite often used in ordinary life is so obvious that we think 
it unnecessary to insist on this fact. Without induction the human species 
would already have disappeared from the face of the earth; in the all-important 
field of reasoning, deductive logic is not enough to guarantee success in the
struggle for life. Likewise, induction (in particular in the form of analogy)
shows itself to be indispensable in the domains of technology and applied
science (engineering, medicine, etc.).

However, as is very well known, many people deny the relevance of 
induction for science. Science, according to them, is simply not inductive. In 
general such philosophers either base their opinions on too restrictive a 
conception of what induction is, or their views, when closely examined, are 
seen to be unacceptable. We do not intend to enter into details on this point 
(see, for example, Salmon [1978] pp. 10-12); suffice it to say that in everyday 
life, science and technology we use certain inductive techniques in order to, 
among other things, make predictions, forecast future experience, etc. Induction 
therefore imposes itself upon us as a significant actuality and some form of 
inductive logic should be developed in order to investigate and systematize it. 


3.2 Forms of Inductive Argument 


Among the various forms of induction we can distinguish the following (the 
list is not exhaustive of course): 


1. Induction by simple enumeration; 
2. Induction based on Mill's canons (elimination); 
3. Analogy; 
4. Direct inference (from the frequency of an attribute in the parent 
population to the frequency of the same attribute in a sample of that 
population (cf. Carnap [1963])); 
5. Indirect inference (from sample to population (cf. Carnap [1963])); 
6. Predictive inference (from sample to sample (cf. Carnap [1963])); 
7. The strict hypothetico-deductive method; 
8. The generalized hypothetico-deductive method; 
9. The statistical syllogism; 
10. The argument of authority; 
11. Inference from perception; 


340 Newton C. A. da Costa and Steven French 


12. Inference from memory; 
13. Testimony; 
14. Statistical inference in general. 


In the remainder of this section, and in the next three, small Greek letters 
will represent propositions and capital Greek letters will denote sets of 
propositions. The symbols $\rightarrow$, $\wedge$, $\vee$, $\neg$, and $\forall$ will be employed as 
abbreviations for (material) implication, conjunction, disjunction, negation, 
and the universal quantifier respectively. 

An argument may be conveniently expressed as follows: 

\[ \frac{\alpha_1, \alpha_2, \ldots, \alpha_n}{\alpha} \]

where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the premisses and $\alpha$ is the conclusion. When 
$(\alpha_1 \wedge \alpha_2 \wedge \ldots \wedge \alpha_n) \rightarrow \alpha$ is logically true, the argument is valid, otherwise it is 
invalid. 
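
The validity criterion just stated can be illustrated, for the propositional case only, by a brute-force truth-table check. The following is a minimal sketch and not part of the original paper: the function name `is_valid`, the representation of propositions as Python functions over a truth assignment, and the two sample arguments are all assumptions introduced for illustration.

```python
from itertools import product


def is_valid(premisses, conclusion, variables):
    """Return True if no truth assignment makes every premiss true
    while making the conclusion false (propositional case only)."""
    for values in product([True, False], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premisses) and not conclusion(env):
            return False  # counterexample found: the argument is invalid
    return True


# Modus ponens: from p and (p -> q), infer q.
modus_ponens = is_valid(
    premisses=[lambda e: e["p"], lambda e: (not e["p"]) or e["q"]],
    conclusion=lambda e: e["q"],
    variables=["p", "q"],
)

# Affirming the consequent: from q and (p -> q), infer p.
affirming_consequent = is_valid(
    premisses=[lambda e: e["q"], lambda e: (not e["p"]) or e["q"]],
    conclusion=lambda e: e["p"],
    variables=["p", "q"],
)

print(modus_ponens, affirming_consequent)  # True False
```

The second argument is invalid in the deductive sense, which is exactly the kind of inference that, on the broad definition adopted here, falls to inductive logic to assess.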

We know that a valid argument is unconditionally valid, in the sense that its 
validity depends only on the meanings of its premisses and conclusion, other 
circumstances being irrelevant. On the other hand, an inductive inference is 
made in the presence of a set of pertinent conditions, which confer more or less 
plausibility upon it. Equivalently we can say that the plausibility of an 
induction, i.e., the plausibility of the conclusion, as supported by the premisses, 
is also a function of certain extra conditions, which we call side conditions. 
Thus, the logical treatment of a given form of induction basically reduces to the 
inventory and analysis of its (pertinent) side conditions which, when added to 
the premisses, contribute to the likelihood of the conclusion. 

Hence, there are two basic differences between deduction and induction: 
deduction is truth-preserving, while induction is not, and deduction does not 
depend on side conditions, while the latter are essential to induction. 
Furthermore, from our point of view, there exists a third fundamental 
distinction between these two forms of reasoning: to estimate induction even 
in a rough, qualitative manner, it is convenient, as will be shown below, to 
consider as its primary task the search for pragmatic or quasi-truth, and not for 
truth tout court; that is, it arrives at truth only indirectly or in special cases. 

Evidently, then, in the case of inductive inferences we must modify our 
schema above in order to accommodate the set of side conditions $\Gamma$. $\Gamma$ consists 
of the propositions expressing the pertinent system of knowledge, as well as 
those specific bits of evidence which contribute to the plausibility of the 
inference. As the standard texts of logic and of statistics make clear, the set $\Gamma$ is 
of the utmost importance; for example, if we are dealing with indirect 
inference, the sample examined must be random, reasonably large and 
sufficiently representative of the parent population and all relevant facts 
relative to it must be critically assessed, etc. Consequently, an induction may 
be appropriately represented by the following schema: 

\[ \frac{\alpha_1, \alpha_2, \ldots, \alpha_n}{\alpha}\;\Gamma \]




decidable, into a decidable proposition $\alpha'$; in effect, it suffices to take $\alpha'$ as the 
proposition that says, essentially, that $\beta$ is quasi-true during some time 
interval. Through this device, we can attribute pragmatic probabilities even to 
non-decidable propositions; when the time interval is large enough we can 
identify $\alpha$ and $\alpha'$. Of course this strategy involves an idealization, but 
statements expressing scientific hypotheses and theories are such that by this 
artifice we can confer upon them non-zero prior probabilities (we shall return 
to this point later). This is because a good scientific hypothesis or theory which 
is pragmatically true in a sps $\mathfrak{A}$, which codifies certain aspects of a domain A, is 
also absolutely pragmatically true, when $\mathfrak{A}$ is well chosen. For example, 
classical mechanics, Bohr's atomic theory and Maxwell's electrodynamics are 
all quasi-true in convenient limited areas, and will remain so for the time 
being; we are allowed therefore to attribute to them the pragmatic probability 
1. 

When we are attributing pragmatic probabilities to certain statements $\beta_1$, 
$\beta_2, \ldots, \beta_n$, it is natural to deal with the corresponding pragmatic propositions 
$\alpha_1, \alpha_2, \ldots, \alpha_n$, and then to close the set $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ by the operations 
defined by the usual group of connectives. The propositions thus obtained 
constitute essentially a Boolean algebra and by means of this the pragmatic 
probability calculus can be embedded in the ordinary subjectivistic probability 
calculus (for further details concerning topics treated in this section see da 
Costa [1986]). 


5 INDUCTIVE LOGIC 
5.1 Introduction 


On the basis of the above foundations we are now able to construct our system 
of inductive logic, following the outline previously given in da Costa [1987]. 

We have argued above that to probabilize induction is, in a certain 
important sense, to probabilize the general hypothetico-deductive method and 
that what we are essentially concerned with in such a procedure are pragmatic 
probabilities. Indeed, given the points previously noted, it seems to us that this 
is the only secure probabilistic basis for a system of this type. Nevertheless, our 
construct is in a certain sense only minimally different from the intuitive 
informal logic of inductive inference, since in many of the relevant cases 
pragmatic truth and ‘absolute’ truth in fact coincide (this ‘minimal’ difference 
is all-important of course). 


5.2 ‘Bayes’ Principle’ 
We begin by noting that, given what we have just said, the inductive argument 

\[ \frac{\alpha_1, \alpha_2, \ldots, \alpha_n}{\alpha}\;\Gamma \]

means that from the quasi-truth of $\alpha_1, \alpha_2, \ldots, \alpha_n$ and of the underlying 
conditions $\Gamma$, we infer the quasi-truth of $\alpha$. Thus the plausibility of inductive 
inferences is judged with respect to the quasi-truth of the statements 
concerned. Furthermore, in order to make the application of the hypothetico- 
deductive method more reliable, overall, several hypotheses should initially be 
formulated and compared (cf. the arguments of Salmon [1968] and Shimony 
[1970], which may be adapted to our account). Thus, in certain cases, for 
example, we compare a particular hypothesis with the negation of the 
statement expressing the quasi-truth of the hypothesis. 

The primary instrument of such comparative procedures is, of course, 
Bayes’ Theorem, applied to pragmatic probabilities. As is well known, this 
provides an effective mechanism for describing changes in probability 
attributions and indeed it might be said that the essence of the hypothetico- 
deductive method in general lies in exactly such transformations from prior to 
posterior probabilities (cf. Shimony [1970], p. 85). Bayes’ Theorem also serves 
as a basis for the following rule, which can be called ‘Bayes’ Principle’: 

Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be pragmatic propositions whose truth is involved in an 
investigation connected with a pragmatic structure $\mathfrak{A}$, which systematizes a 
determined domain of knowledge A, and let us further suppose that every $\alpha_i$, 
$1 \le i \le n$, has a prior probability different from zero (if $\alpha_i$ asserts that $\beta_i$ is 
pragmatically true in $\mathfrak{A}$, $1 \le i \le n$, then $\beta_1, \beta_2, \ldots, \beta_n$ are the pertinent 
hypotheses which save the appearances). Then, given a new piece of evidence 
$\alpha$ which is also a pragmatic proposition, we should (temporarily) accept the $\alpha_i$, 
$1 \le i \le n$, whose posterior probability is the highest. In many cases we have 
simply $n = 2$, with $\alpha_2$ being $\neg\alpha_1$. 
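
The rule just stated can be sketched numerically. The following is only an illustration under stated assumptions, not the authors' formalism: it assumes the candidate propositions are mutually exclusive and jointly exhaustive (so that Bayes' Theorem can be applied by simple normalization), and the function names `posteriors` and `bayes_principle`, the labels, and the numerical priors and likelihoods are all hypothetical.

```python
def posteriors(priors, likelihoods):
    """Apply Bayes' Theorem to mutually exclusive, exhaustive hypotheses.

    priors[h]      : prior probability of hypothesis h
    likelihoods[h] : probability of the new evidence given h
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the evidence has zero probability under every hypothesis")
    return {h: value / total for h, value in joint.items()}


def bayes_principle(priors, likelihoods):
    """Return the hypothesis to accept (temporarily) and all posteriors."""
    if any(p <= 0 for p in priors.values()):
        raise ValueError("every hypothesis needs a non-zero prior")
    post = posteriors(priors, likelihoods)
    best = max(post, key=post.get)  # highest posterior probability
    return best, post


# The common case n = 2: a hypothesis and its negation (illustrative numbers).
priors = {"alpha_1": 0.5, "not_alpha_1": 0.5}
likelihoods = {"alpha_1": 0.8, "not_alpha_1": 0.2}

accepted, post = bayes_principle(priors, likelihoods)
print(accepted, {h: round(p, 3) for h, p in post.items()})
```

The check on non-zero priors mirrors the proviso in the principle itself; acceptance remains temporary in the sense that a further piece of evidence may change which proposition has the highest posterior.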

Two important considerations should be noted at this point. The first is that 
as well as the conditional probability of the evidence relative to each 
hypothesis, we also obviously have to be able to estimate the prior probabilities 
of the hypotheses being compared and this estimation does not take place ‘in 
vacuo', as it were. That is, the prior probabilities are not evaluated in the 
absence of any background knowledge. On the contrary, their computation is 
normally made relative to a given set of side conditions. Thus 'absolute' prior 
probabilities, assigned without regard to any previous body of knowledge, are 
not given any significant role in our system (Salmon [1968] and Shimony 
[1970] adopt a similar position with regard to scientific inference; cf. also Teller 
[1975]). 

Secondly, our pragmatic probability measures are not defined over universal 
languages, or even very powerful ones. Rather they are introduced in 
restricted languages which are still sufficiently rich to ensure the application of 
Bayes’ Theorem (cf. Garber [1983]). 

It should also be emphasized that the usual confirmation theorems can 
easily be interpreted in pragmatic probability terms. These results then show 
how the use of the probability calculus, under this interpretation, renders our 
choices between competing hypotheses both more rational and more 
‘organic’ (again certain aspects of Salmon’s and Shimony’s accounts can be 
adopted here). 


5.3 Simple Induction and Analogy 


By way of example, let us consider how simple induction and analogy can be 
reduced to the general hypothetico-deductive method. 

Induction by simple enumeration proceeds as follows: from the premises 
that some $x_i$, $1 \le i \le n$, which belong to a class A, also belong to the class B, we 
'induce' the conclusion that $\forall x (x \in A \rightarrow x \in B)$. 

Of course, such inferences are also based on a certain set $\Gamma$ of side conditions, 
which may include such information as ‘no x is known which is a member of 
A, but not of B', 'the connection between A and B does not appear to be purely 
accidental’, etc. 

Simple induction may then be symbolized by the schema: 

\[ \frac{x_1 \in A \rightarrow x_1 \in B,\ x_2 \in A \rightarrow x_2 \in B,\ \ldots,\ x_n \in A \rightarrow x_n \in B}{\forall x (x \in A \rightarrow x \in B)} \]

where, to eliminate certain undesirable results, we may admit that the 
sentence $x_i \in A$ belongs to $\Gamma$, for all $i = 1, 2, \ldots, n$. In this schema the premises 
are logical consequences of the conclusion and thus an induction by simple 
enumeration may be logically converted into an instance of the strict 
hypothetico-deductive method. 

Analogy, in its simplest form, consists in inferences of the following kind: 


The object x, which belongs to the class A, also belongs to the class B; 
The object y belongs to A; 
Therefore, y belongs to B. 




Again such reasoning is normally accompanied by a certain set of underlying 
conditions $\Sigma$, containing plausible reasons to support the inference. We can 
therefore convert analogy into the following inferential form: 

\[ \frac{y \in A}{y \in B}\;\Sigma' \]

where $\Sigma'$ is $\Sigma$ plus the two statements that $x \in A$ and $x \in B$. Therefore analogy 
may also clearly be reduced to the generalized hypothetico-deductive method. 

In order to probabilize these two forms of inductive inference, in order to 
apply Bayes' Theorem etc., certain pragmatic probabilities must be evaluated. 
With regard to analogy, for example, we have to evaluate the probability that 
$y \in A$, given that $y \in B$ and the statements of $\Sigma'$. It should be noted, however, that 
normally the probabilities involved are only roughly calculated; that is, we 
often proceed qualitatively. 

This leads us to recall the point made in the previous section, that in many 
cases qualitative, rather than quantitative, probabilities are all that are 
needed. The required theorems can then be obtained on the basis of a suitable 
axiomatization of this notion, such as is given in Koopman [1940], for 
example. 

We conclude this section by noting, first of all, that not all changes in our 
probability measures proceed according to Bayesian conditionalization, 
something which can be regarded as a consequence of their local character. 
We shall return to this point in the section below (see also da Costa [1986]). 

Secondly, and finally, we recognize that inductive logic as a whole is 
somewhat richer than the system described above; it includes, for example, the 
entire theory of elimination, in particular Mill’s methods (cf. von Wright 
[1951]). However, we believe that the outline presented here is, at the very 
least, a useful first step towards a more inclusive characterization. 


6 PHILOSOPHICAL DISCUSSION 
6.1 Characteristics of our System 


We begin by recapitulating the essential characteristics of the system of 
inductive logic developed here; it is tentative, local and instrumental, three 
characteristics which it shares with Shimony’s ‘tempered personalist’ view 
(Shimony [1970]). 

It is because the principal objective of any induction is to achieve some kind 
of tentative judgment, expressed in the form of a hypothesis, that induction as a 
whole can be visualized as an application of the hypothetico-deductive method 
and probabilized in terms of the subjective probability calculus. Furthermore, 
this tentative nature of our inductive inferences is expressed in the claim that 
such inferences do not aim for the truth, as such, but for ‘pragmatic’ or 'quasi-' 
truth. As we shall see, it is then reflected in our attitudes towards the 
acceptance of scientific theories. 

Our treatment is also local, in the sense that the analysis of an individual 
induction must be made in concrete terms, trying to take account of its 
relevant peculiarities. In other words, the application of pragmatic probabili- 
ties is circumscribed by the relevant conditions of a particular investigation 
and, in particular, such conditions change when the set of hypotheses under 
consideration changes, leading to a related change in the assignment of our a 
priori probabilities. It is interesting to note a comparison here with Garber's 
treatment of Bayesianism, in which the logically possible worlds of the global 
language as usually formulated are replaced with more modestly constructed 
epistemically possible worlds, specified according to our immediate interests 
(Garber [1983]). 

Finally, our position is essentially instrumental, in the sense that induction is 
regarded as merely a device for achieving ‘quasi-truth’ and must be evaluated 
from this perspective. In particular, other, alternative, systems may also be 
constructed, with the choice among these various possibilities being dictated 
by pragmatic as well as logical considerations. 

We have reemphasized these characteristics here because it is through them 
that our system can overcome the more serious objections usually levelled 
against the idea of inductive logic in general. 


6.2 The Problem of Universal Laws 


The first of these which we shall consider is the problem of universal laws (see, 
for example, Nagel [1963], or, for a more general account, Gillies [1987]). 

The problem is more or less the following: what probability should we assign 
to universal laws which have the form $(\forall x)Qx$, where the quantifier ranges 
over a potentially infinite set of objects? Popper, as is well known, has argued 
that the prior probability of such a law is always zero, giving a degree of 
confirmation also equal to zero on a probability-based confirmation theory 
(Popper [1972], Appendix VII). Given the apparently obvious fact that 
universal laws are common in science and can be positively confirmed, this 
presents a serious difficulty for any inductivist view. 

A well-known response to this objection is to accept that the a priori 
probability of a universal law is zero, but assert that, pragmatically speaking, 
science proceeds directly from particulars to particulars without the mediation 
of such laws (Carnap [1963], Section 110; Hesse [1974]). Thus what is 
confirmed in science are not universal laws but their particular instances, with 
the laws themselves possessing merely a heuristic value in terms of discovering 
predictions. 

Unfortunately this view cannot account for the use in physics, for example, 
of complex high-level theories containing various explicitly theoretical terms 
to make predictions. As Putnam has noted, there do exist examples in the 
history of science where there is no direct inductive evidence for a particular 
prediction r but where the evidence in conjunction with a certain group of 
relevant theoretical propositions gives r (Putnam [1975], Chapter 17). 

The challenge, therefore, is to show how putative universal laws may be 
accommodated within a particular system of inductive logic without eliminat- 
ing theories as a crucial part of the inductive process. That this challenge can 
be met by the system developed here is obvious if we consider the view of truth 
that is employed in most accounts of the problem. The usual (implicit) 
assumption as regards the (complete) truth of the laws concerned is clearly 
revealed by Gillies' reformulation of the problem in subjectivist terms (Popper 
having established his conclusion on the basis of the logical interpretation of 
probability): 'Take, for example, l = All Ravens are Black, and suppose A is 
forced to bet on whether l is true. A can never win the bet, since it can never be 
established with certainty that all ravens are black. However, A might lose the 
bet if a non-black raven happens to be observed. Thus the only reasonable 
betting quotient is q(l) = 0, which indeed can be considered as a kind of refusal 
to bet.' (Gillies op. cit. pp. 24-5) 




The crucial point is that A is forced to bet on whether the hypothesis is true. 
However it is obviously more realistic (in all senses of the word) to say that 
scientific propositions are pragmatically or quasi-true only, with our degree of 
belief in such a proposition treated not as a belief in its truth, as such, but in its 
quasi-truth and evaluated via the pragmatic probability calculus. It is then 
immediately clear that non-zero a priori probabilities can be given to 
supposedly universal hypotheses on this view. 

That is, if the probability were to be interpreted simply in terms of the degree 
of belief in the absolute truth of the hypotheses then the only reasonable value 
to assign this probability is zero, but this does not follow if we take the 
probability to reflect the degree of belief in the hypothesis's partial or quasi- 
truth (da Costa [1986], p. 147). In the example above, A would then be forced 
to bet on the partial truth of ‘All Ravens are Black’, which of course allows him 
or her to take account of any uncertainty involved and the betting quotient 
would not then necessarily be zero. 

But what exactly do we mean by 'uncertainty' in this context? That is, how 
do we explicate the quasi-truth of universal hypotheses of the type 'All Ravens 
are Black'? One possible, and plausible, answer, as we noted above, is to say 
that such hypotheses are regarded as true at, or up to, a certain time only, the 
'uncertainty' involved being expressed through the possibility of potentially 
refuting instances being discovered at some future time (cf. the notion of 
‘skew’-truth developed in da Costa and French [1988]). Thus our subject A bets 
on 'All Ravens Are Black' being quasi-true in the sense that he or she accepts 
the hypothesis as being true within the domain of knowledge which is 
accessible (and here the background conditions play an important role) and up 
to the time of the bet, and is then betting on it continuing to be true in the 
future, up to some time to be specified. In other words, a temporal element is 
introduced into the pragmatic structures in this case and we venture to 
suggest that it is the existence of such an element which distinguishes the 
somewhat artificial hypotheses of this form from the more complex and 
essentially timeless laws of science. This obviously leads to further consider- 
ations of the nature of such laws which we cannot pursue here. 

It is also important to be clear about the consequences of the above move. 
We are not saying that universal theories can be eliminated in the Carnap- 
Hesse manner, with probabilities being assigned to particular instances only, 
but that such theories should not be regarded as 'universally', in the sense of 
absolutely, true and that our degrees of belief should apply to partially true 
propositions. At this point we may echo Dorling's words: 'What is important is 
not how philosophers construe physicists’ theories but how physicists 
construe them.’ (Dorling [1972], p. 183.) But it is philosophers, and not 
physicists themselves, who assert the ‘literal’, that is, absolute and universal, 
truth of theories. 

Furthermore, by not eliminating such theories from the inductive process, 
our system allows us to retain them in establishing the plausibility of a 
prediction r on the basis of the evidence e. It is therefore entirely capable of 
meeting Putnam’s challenge above. We recall that an induction is symbolized 
thus: 

\[ \frac{\alpha_1, \alpha_2, \ldots, \alpha_n}{\alpha}\;\Gamma \]

where $\Gamma$ represents the set of underlying conditions, including theories of 
various sorts, which, together with the $\alpha_i$, contribute to the plausibility of $\alpha$, 
this plausibility being evaluated by means of subjective pragmatic probabilities. 
Thus, in the particular case of Putnam’s example, we are concerned with the 
degree of belief in the quasi-truth of r, given the empirical premiss e and a 
certain set of underlying theories included in $\Gamma$, which are themselves only 
quasi-true as well. 


6.3 The Problem of a priori Probabilities 


This question of the a priori probability of universal laws is clearly just a special 
case of the problem of the assignment of a priori probabilities in general; that is, 
since the latter may be regarded as essentially undetermined within the 
subjectivist theory itself (see, for example, Howson [1985], pp. 305-9), there 
arises the possibility of different (ideal) subjects making radically different 
assignments for the same initial hypothesis. 

In reply, we note that we can adopt a version of Shimony's answer to this 
problem (Shimony [1970], especially pp. 102-3). We begin by recalling the 
requirement that various hypotheses concerning the subject matter under 
consideration should be formulated in order to increase the reliability of our 
method. In other words, we begin with not one but a set of hypotheses, the 
limits on this set depending on certain general principles to be outlined below 
and the context of the investigation (Shimony op. cit. pp. 110-14). 

If we then allow that some a priori probability can be assigned to the 
hypotheses, and our discussion above suggests that it always can, and that one 
and only one of the set of hypotheses is significantly more quasi-true than the 
alternatives (this might be called a realist demand), so that the likelihood p(e/h) 
of a piece of evidence upon this hypothesis is much greater than for the 
others (which seems reasonable, given our realist demand), then any 
difference in a priori probabilities will be effectively ‘swamped’, as far as a 
consideration of the a posteriori probabilities is concerned (Shimony ibid. pp. 
102-3). In other words, given that we are concerned with the formulation and 
comparison of various hypotheses and not just one, it can be shown that 
differences in a priori probabilities become insignificant in the application of 
the probabilistic approach and (social) consensus may then be achieved. 
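
The ‘swamping’ effect can be illustrated numerically. The following is only a minimal sketch, not part of Shimony's argument: the three hypotheses, the likelihood values, the two prior assignments and the function names are all hypothetical, and the items of evidence are assumed to be independent given each hypothesis. It shows two subjects who start from radically different priors and update by Bayes' Theorem on the same evidence.

```python
def bayes_update(priors, likelihoods):
    """Apply Bayes' Theorem once across a finite set of rival hypotheses."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def update_n_times(priors, likelihoods, n):
    """Conditionalize on n items of evidence, each with the same likelihoods."""
    probs = list(priors)
    for _ in range(n):
        probs = bayes_update(probs, likelihoods)
    return probs


# Hypothetical likelihoods p(e/h) for three rival hypotheses; the first is
# the one assumed to be significantly more quasi-true than its rivals.
likelihoods = [0.9, 0.1, 0.1]

# Two subjects with radically different (but non-zero) prior assignments.
priors_x = [0.60, 0.20, 0.20]
priors_y = [0.05, 0.50, 0.45]

# Both subjects conditionalize on the same five items of evidence.
post_x = update_n_times(priors_x, likelihoods, 5)
post_y = update_n_times(priors_y, likelihoods, 5)

print(round(post_x[0], 4), round(post_y[0], 4))
```

Under these assumed figures the two subjects' posterior probabilities for the favoured hypothesis differ by less than one part in a thousand, although their priors differed by more than one half. The sketch depends on every prior being non-zero, which is why a non-negligible prior must be given to each hypothesis in the set.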

However, these considerations imply that a non-negligible prior probability 
must be given to any new hypothesis which is suggested during the 
investigation of the subject matter under consideration, leading to a redistribution 
of the probabilities already assigned and a possible violation of the axioms 
of probability (Shimony ibid., p. 104). 

Our response in this case is to note that the pragmatic probability 
interpretation, like the tempered personalist view, is local, in the sense given 
above. Thus when the set of alternatives is augmented in this way, the 
conditions of the investigation are changed and our prior probabilities must be 
re-evaluated. It is important to note for what follows that the use of the 
probability calculus to infer the prior probabilities associated with the new 
conditions from those associated with the old, can lead to some bizarre results. 


6.4 The 'Hacking Problem' 


This is again a special case of a general problem, referred to by Gillies as ‘the 
Hacking problem’ (Gillies [1987], pp. 19-23) and the point just made provides 
an answer to the latter as well. 

The nature of the problem can be made clear in the following way: At some 
time t, before evidence e has been collected, our subject X assigns betting 
quotients $q_t(h)$ and $q_t(h/e)$ on h and h given e respectively. At time t', where 
t' > t, e is known to be the case and this is the only extra information X has 
acquired since t. X now assigns a quotient of $q_{t'}(h)$ to h. Then our subject has 
changed his or her belief according to Bayesian conditionalization provided he 
or she sets 


$$q_{t'}(h) = q_t(h/e)$$


Hacking calls this the ‘dynamic assumption’ and argues that the condition 
of coherence, which is essential to Bayesianism, does not compel X to satisfy 
this assumption (Hacking [1967], pp. 313-16). 

That this assumption might be less than plausible is claimed by Gillies 
through the following example: 

‘If A is asked to bet on some random process, it would be quite reasonable 
to assume at first that [the process] consists of independent events, and to calculate his or 
her betting quotients accordingly. This amounts, within the subjective theory, 
to making the assumption of exchangeability, that is, the assumption that the 
order of the events is of no significance. The observation of a few hundred 
results of [the process] may, however, convince A that order is relevant after all, that the 
sequence exhibits dependencies and after-effects. At this stage, if asked to bet 
again, A might want to abandon the assumption of exchangeability and use 
quite a different scheme for calculating his or her betting quotients. But these 
new betting quotients will not then be obtained from the old ones by Bayesian 
conditionalization (the dynamic assumption).’ (Gillies op. cit., p. 22) 

However, if the probabilistic scheme is local, as ours is, then this objection 
loses all its force. As we noted above, any change in the hypotheses under 
consideration implies a change in the underlying conditions of the investigation 
and a reassignment of our a priori probabilities (or betting quotients). That 
such a re-evaluation can easily be accommodated within our approach can be 
seen if we recall our other requirement that for reliability’s sake we should 
consider more than one, indeed a whole set of, hypotheses. Thus, in Gillies’ 
example, above, we would ask A to first of all consider at least two hypotheses: 


h = the process consists of independent events (exchangeability); 
h' = the process consists of dependent events (some assumption other than 
exchangeability). 


These form part of the ‘background conditions’. 
Given that he or she is told that the process is random (more background) A would 
then reasonably assign a higher prior probability to h than to h', i.e. 


$$p(h) > p(h')$$


If he or she is open-minded then $p(h') \neq 0$. 

Two sets of betting quotients are then made for the events $\{e_n\}$ of the process on the 
basis of each assumption respectively. After observing n events in the process, 
A comes to believe that the process exhibits dependencies and will therefore accord a 
higher posterior probability to h' given $\{e_n\}$, than to h. That is, 


$$p(h'/\{e_n\}) > p(h/\{e_n\})$$


the posterior and prior probabilities for each hypothesis being related by Bayes’ 
Theorem as usual. 

Given this result it is then entirely rational for A to shift allegiance from h to 
h’, where such a shift involves believing that the latter is the more 
pragmatically true. If then asked to bet again at this stage, that is to bet on 
event $e_{n+1}$, it is then also entirely rational for A to make this bet on the basis of 
the betting quotients established according to h'. The fact that these quotients 
are not obtained by Bayesian conditionalization from the previous ones, based 
on assumption h, should not come as any surprise, since A is now operating 
within a different background context and Bayes' Theorem only applies to 
changes in our probability assignments within the same context. 
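
The shift from h to h' can be made concrete with a small numerical sketch. Everything specific in it is an assumption introduced for illustration and is not taken from Gillies or from the text above: h is modelled as independent fair tosses, h' as a process in which each event repeats its predecessor with probability 0.1, the prior for h is set at 0.9, and the observed sequence is twenty strictly alternating outcomes.

```python
def likelihood_independent(seq, p_one=0.5):
    """p(seq/h): independent events, each equal to 1 with probability p_one."""
    prob = 1.0
    for outcome in seq:
        prob *= p_one if outcome == 1 else 1.0 - p_one
    return prob


def likelihood_dependent(seq, p_repeat=0.1):
    """p(seq/h'): each event repeats its predecessor with probability p_repeat."""
    prob = 0.5  # the first event has no predecessor; treated as a fair toss
    for prev, cur in zip(seq, seq[1:]):
        prob *= p_repeat if cur == prev else 1.0 - p_repeat
    return prob


def posterior_of_h_prime(seq, prior_h=0.9):
    """Posterior of h' by Bayes' Theorem, with h and h' as the only rivals."""
    weight_h = prior_h * likelihood_independent(seq)
    weight_h_prime = (1.0 - prior_h) * likelihood_dependent(seq)
    return weight_h_prime / (weight_h + weight_h_prime)


# Hypothetical observations: twenty strictly alternating results, ending in 1.
observed = [0, 1] * 10
post_h_prime = posterior_of_h_prime(observed)

# Betting quotients on the next event being 1, under each hypothesis.
quotient_under_h = 0.5
quotient_under_h_prime = 0.1 if observed[-1] == 1 else 0.9

print(post_h_prime > 0.99, quotient_under_h, quotient_under_h_prime)
```

With these assumed figures the posterior of h' exceeds 0.99 even though its prior was only 0.1. The quotient A would now offer on the next event is the one computed under h', not a conditionalized version of the quotient computed under h.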

In other words, being local in our sense means abandoning the ‘dynamic 
assumption’, at least as far as changes in the background conditions are 
concerned (after all, whoever said that Bayes' Theorem had to apply to such 
changes as well!) and since we have, in this case, no dynamic assumption to 
justify, Hacking's objection loses its bite. 


6.5 Induction and Theory Change in Science 


This then brings us on to the question of theory change in science in general 
which is connected, as we shall now see, to at least one aspect of the so-called 
‘problem of induction’. 




Following Oliveira [1985], the latter may be conveniently reduced to the 
following set of three questions: 


1. what is the nature of the attitude of acceptance that we adopt in relation 
to certain theories? 

2. what are the rules of acceptance for theories? 

3. what is the justification for the adoption of those rules? 


An answer to the first can be given in terms of the pragmatic acceptance of a 
theory: to pragmatically accept a theory means to believe that it has not been 
refuted within the domain of knowledge which it models and that it is 
pragmatically true (cf. Oliveira op. cit. pp. 133-4: to accept a theory means to 
believe that it has not been refuted and that it contains an ‘element of truth’). 
The belief in the theory's pragmatic truth is then reflected in our subjective 
probability assignment. 

Thus we claim that one should accept a hypothesis which has an a posteriori 
pragmatic probability of 1, where this value reflects our degree of belief in the 
partial- or quasi-truth only of the hypothesis. It is worth comparing this with 
Shimony’s tempered personalism approach, where he argues that we should 
accept a hypothesis which has an a posteriori probability close to, but not 
equal to, 1 (Shimony op. cit. pp. 120-1 and pp. 130-1), the difference 
expressing the tentative attitude maintained towards the hypothesis. On our 
view, however, this tentative attitude is reflected, not in terms of a difference in 
numerical values, but rather in terms of the difference between quasi- and 
‘complete’, or pragmatic and ‘absolute’, truth. This, we would argue, is a more 
intuitive way of capturing the idea that commitment to a hypothesis is weaker 
than belief in its literal truth. 

Just to be absolutely clear on this point: when we say that hypotheses 
acquire a posteriori probabilities of 1, we mean probabilities of 1 that they are 
pragmatically, and not absolutely or literally, true. These are pragmatic 
probabilities that we are talking about, the introduction of which is motivated 
by our desire to capture precisely our tentative attitude towards hypotheses 
which 'go beyond the evidence'. 

This approach obviously meshes well with a 'cumulative' account of 
scientific progress, as we have already noted. While we do not intend to enter 
into all the details here (see, for example, Post [1971]; an excellent, but 
relatively neglected, defence of such a view) it is worth noting that the so-called 
General Correspondence Principle, which, put rather prosaically, basically 
says that we never lose 'the best' of what we had, is mirrored within our system 
by what might be called 'the Principle of the Permanent Nature of Pragmatic 
Truth’: once a theory has been shown to be pragmatically true in a certain 
domain it remains pragmatically true, within that domain, for all time (cf. 
Oliveira op. cit. p. 134). It is this, of course, which lies behind the justification 
for continuing to use Newtonian mechanics within certain limits. 




Turning now to the second question above, we note that in the context of an 
inductive logic it can be broken down into two parts: 


(i) what are the rules for the a posteriori acceptance of theories? 
(ii) what are the rules for the a priori selection of theories? 


In the first case, our answer is simply that such rules are embodied in the 
calculus of pragmatic probability. We accept a hypothesis h when its pragmatic 
probability, conditional on evidence e, is equal to 1, reflecting our degree of 
belief in the partial truth of h, as we have just said. It is worth pointing out that 
the conditionalization may be not just on the evidence but, in some cases, on 
the existence of a connection between the hypothesis and the evidence, as 
expressed by the statement 'h implies e' (da Costa and French [1988]). 

The importance of the second sub-question is not generally acknowledged, 
and answering it is a less straightforward matter than for the first. The 
crucial problem here is how do we rule out ‘absurd’ or ‘ridiculous’ hypotheses 
to begin with, or how can we guarantee that the hypotheses to which we 
assign our initial degrees of belief are 'serious'? (Shimony op. cit. pp. 110-14; 
cf. Oliveira op. cit. pp. 136-7). 

It can be said that there is usually a general and intuitive agreement within 
a particular scientific community about which hypotheses are to be taken 
'seriously' within the domain under consideration. Nevertheless, it is difficult, 
perhaps even impossible in principle, to codify a set of rules which can 
unambiguously distinguish those hypotheses which are seriously proposed 
from those which are not. We would argue that we can, however, set down 
certain 'desiderata' (see Post op. cit. pp. 221-43), in the sense of methodologically 
sensible guidelines, for choosing serious hypotheses as members of our 
initial set and the fact that we can do so, and that such desiderata may seem 
plausible, suggests that our second problem above can be resolved. In 
particular, it should be emphasized that scientific hypotheses are not, of 
course, proposed within an intellectual vacuum but with respect to certain 
'background information' which confers an initial plausibility upon them even 
before 'experience' is brought to bear. However, these points take us deep into 
the area of heuristics which, again, we do not intend to discuss here. Suffice to 
repeat that by way of such desiderata we may suitably restrict the set of 
alternative hypotheses to be formulated in order to increase the reliability of 
our inductive conclusions. 

Finally, we come to question 3 above, the question of the justification of our 
inductive mechanisms, commonly regarded as 'the' problem of induction. 
Obviously we cannot do justice to it here, much less offer any kind of adequate 
‘answer’. What we would like to emphasize, however, is that, in our terms, it is 
reasonable to accept inductive inferences because we accept them according to 
the rules of the pragmatic probability calculus. The whole basis of our 
approach lies in the claim that induction points to pragmatic truth and we 
search for such truth following our own (presumably candid) degrees of belief. 
To proceed this way is to be rational. In other words, to be rational is to 
(pragmatically) accept those propositions which are better grounded in the 
available evidence. Of course, the fundamental basis of this attitude is 
subjective, in that it arises in the first place from our intuitions concerning the 
weights of beliefs, but what other basis could there be? The most we can do is to 
proceed according to our ‘sound’ beliefs, although we are unable to prove that 
they rigorously correspond to reality. 

In particular, ‘Addicts of deduction should not harbor flattering illusions’. 
(Black [1970], p. 144). If there is a ‘problem’ of induction, then there is one of 
deduction as well: why and how does deduction function in the way that it 
does? Does the underlying language of a deduction reflect reality? If not, then 
our deduction is vacuous but if so then how do we know that it does? By 
induction? (da Costa [1981], pp. 2-10). 

What we have tried to do in this section is simply to indicate some of the 
more interesting connections between the approach presented here and 
various other discussions in the philosophy of science; we hope to further 
develop these points in a future work. 


ACKNOWLEDGEMENTS 


The authors would like to express their sincere thanks to Greg Hunt and an 
anonymous referee for many helpful suggestions regarding this work and to 
Professors R. Chuaqui and E. Napoli for many useful discussions on the 
problem of inductive inference. We would also like to thank the Institute for 
Advanced Study of the University of São Paulo and the Centre of Logic, 
Epistemology and History of Science of the University of Campinas for support 
given during the writing of this paper. One of us (French) also wishes to 
acknowledge financial support received from the Brazilian National Council 
for Scientific and Technological Development (CNPq). 


Department of Philosophy 
University of São Paulo 
São Paulo, SP, Brazil 


Department of Philosophy 
University of Campinas 
Campinas, SP, Brazil 


REFERENCES 


BLACK, M. [1970]: ‘Some Half-Baked Thoughts About Induction’ in Margins of 
Precision. Cornell University Press, pp. 137-44. 

CARNAP, R. [1963]: The Logical Foundations of Probability. University of Chicago Press. 

DA COSTA, N. C. A. [1981]: Lógica Indutiva e Probabilidade. Publication of the 
Mathematics and Statistics Institute of the University of São Paulo, Brazil. 

DA COSTA, N. C. A. [1986]: ‘Pragmatic Probability’, Erkenntnis, 25, pp. 141-62. 

DA COSTA, N. C. A. [1987]: ‘Outlines of a System of Inductive Logic’, Theoria, 7, pp. 3-13. 

DA COSTA, N. C. A. and FRENCH, S. [198?a]: ‘The Model-Theoretic Approach in the 
Philosophy of Science’, forthcoming in Philosophy of Science. 

DA COSTA, N. C. A. and FRENCH, S. [1988]: ‘Pragmatic Truth, Logical Omniscience and 
the Popper-Miller Argument’, Fundamenta Scientiae, 9, pp. 35-46. 

DA COSTA, N. C. A. and CHUAQUI, R. [198?b]: ‘The Logic of Pragmatic Truth’, 
forthcoming in Erkenntnis. 

DORLING, J. [1972]: ‘Bayesianism and the Rationality of Science’, British Journal for the 
Philosophy of Science, 23, pp. 181-90. 

DE FINETTI, B. [1970]: Teoria delle Probabilità. Einaudi. 

GARBER, D. [1983]: ‘Old Evidence and Logical Omniscience in Bayesian Confirmation 
Theory’, in J. Earman (ed.) Testing Scientific Theories. University of Minnesota Press, 
pp. 99-132. 

GILLIES, D. [1987]: ‘Probability and Induction’, in G. M. R. Parkinson (ed.) The 
Encyclopaedia of Philosophy. Croom-Helm. 

HACKING, I. [1967]: ‘Slightly More Realistic Personal Probability’, Philosophy of Science, 
34, pp. 310-16. 

HESSE, M. [1974]: The Structure of Scientific Inference. Macmillan. 

HOWSON, C. [1985]: ‘Some Recent Objections to the Bayesian Theory of Support’, 
British Journal for the Philosophy of Science, 36, pp. 305-9. 

KOOPMAN, B. O. [1940]: ‘The Axioms and Algebra of Intuitive Probability’, Annals of 
Mathematics, 41, pp. 269-92. 

LINDLEY, D. V. [1965]: Introduction to Probability and Statistics from a Bayesian Viewpoint. 
2 Vols. Cambridge University Press. 

MIKENBERG, I., DA COSTA, N. C. A. and CHUAQUI, R. [1986]: ‘Pragmatic Truth and 
Approximation to Truth’, Journal of Symbolic Logic, 51, pp. 201-21. 

NAGEL, E. [1963]: ‘Carnap's Theory of Induction’ in P. A. Schilpp (ed.) The Philosophy of 
Rudolf Carnap. Open Court Press. 

OLIVEIRA, M. B. DE [1985]: ‘The Problem of Induction: A New Approach’, British Journal 
for the Philosophy of Science, 36, pp. 129-45. 

POPPER, K. R. [1959] and [1972]: The Logic of Scientific Discovery. Hutchinson. 

POST, H. R. [1971]: ‘Correspondence, Invariance and Heuristics: In Praise of 
Conservative Induction’, Studies in History and Philosophy of Science, 2, pp. 213-55. 

PUTNAM, H. [1975]: ‘Degree of Confirmation and Inductive Logic’ in Philosophical 
Papers. Vol. I. Cambridge University Press. 

SALMON, W. C. [1966]: ‘The Foundations of Scientific Inference’ in R. G. Colodny (ed.) 
Mind and Cosmos. University of Pittsburgh Press, pp. 135-275. 

SALMON, W. C. [1978]: ‘Unfinished Business: The Problem of Induction’, Philosophical 
Studies, 33, pp. 1-79. 

SHIMONY, A. [1970]: ‘Scientific Inference’ in R. G. Colodny (ed.) The Nature and Function 
of Scientific Theories. University of Pittsburgh Press, pp. 79-172. 

TELLER, P. [1975]: ‘Shimony's A Priori Arguments for Tempered Personalism’, in G. 
Maxwell and R. M. Anderson (eds.) Induction, Probability and Confirmation. 
University of Minnesota Press, pp. 166-203. 

VON WRIGHT, G. H. [1951]: A Treatise on Induction and Probability. Harcourt Press. 


Brit. J. Phil. Sci. 40 (1989), 357-364 Printed in Great Britain 


Aesthetic Constraints on Theory Selection: 
A Critique of Laudan 


JAMES E. MARTIN 


1 Introduction 

2 Laudan’s Conditions on Epistemic Values 

3 Aesthetic Values: Rational Conditions on Knowing 
4 Summary and Conclusions 


1 INTRODUCTION 


In his Science and Values [1984], Larry Laudan considers the role of axiological 
constraints on theory selection. He presents criteria which are purported to 
characterize those epistemological values that have a legitimate place in 
scientific reasoning. In this context, he rejects the relevance of aesthetic values 
such as simplicity and elegance to rational activity because they do not satisfy 
the criteria he has established. But the evidence is that a number of great 
scientists from Kepler to Einstein have reported themselves to have been 
influenced by such constraints. This presents a difficulty for Laudan’s account. 
Were Kepler and Einstein acting irrationally in permitting themselves to be 
guided by aesthetic values? Not necessarily. 

This paper is a critique of some aspects of Laudan’s account of values, and 
the notion of rationality he seems to presuppose. Much of his discussion is 
disappointingly ambiguous, but his general thrust is clear enough. Laudan 
requires of values that they be capable of being ‘operationalized’ (explicitly 
conceivable and fully realizable) in order that they have a place in rational 
activity. And this requirement eliminates aesthetic values from the realm of 
legitimate epistemic constraints. 

But suppose the case is otherwise. Suppose aesthetic factors are significant 
and rational determinants of the activities of natural scientists. Then another 
view of epistemic values is called for. An alternative is proposed in this paper 
which sheds light on some of the ways in which values, aesthetic and 
otherwise, operate to guide the scientific imagination in respect of both the 
discovery and confirmation of hypotheses. The view of scientific rationality 
achieved thus mitigates the distinction between the so-called contexts of 
discovery and justification. 




2 LAUDAN'S CONDITIONS ON EPISTEMIC VALUES 


Laudan proposes several criteria for evaluating cognitive values. Two of them 
will be considered here. The first is consistency. Laudan assumes without 
argument that cognitive goals must be consistent (p. 50). But is this true? Is it 
not often the case that the values which direct thought in both the discovery 
and evaluation of hypotheses present themselves as antagonistic pairs: unity 
and diversity, clarity and richness of texture, consistency and completeness, 
and so on? Such value pairs are inconsistent in the sense that the full 
realization of one member of a pair would often preclude the full realization of 
its partner. So, it appears that we aim at a kind of harmony between the two 
antagonistic values of such pairs. Moreover, it is suggested that the judgement 
that a theory embodies an appropriate balance or harmony between two or 
more antagonistic epistemic values is, in the broad sense, aesthetically based. 

That the ultimate consistency of cognitive goals is a precondition for 
rational cognition is certainly open to question. The above examples support a 
conclusion to the contrary that rational cognition depends precisely on the fact 
that it is constrained by antagonistic, even inconsistent goals. Suppose an 
investigator were continually moved only by the desire for unity, clarity, and 
consistency of vision, at the expense of diversity, richness of texture, and 
completeness of vision. Would he be behaving rationally—either with respect 
to the discovery or evaluation of hypotheses? Surely not. In what sense could it 
be rational, if one were concerned to understand a domain of inquiry, to 
continually and autistically ignore those aspects of the domain which could 
only be appreciated by allowing oneself to be directed toward diversity, 
richness and completeness of vision? If such behavior were to occur in 
everyday life, it would unquestionably be thought clinically and pathologically 
irrational. In science as well, rational thought requires balancing and 
harmonizing the conflicts that inevitably arise among inconsistent cognitive 
values. 

The above considerations are supported by the available evidence on 
creativity and change both within and without the natural sciences. The fact 
that cognitive goals are antagonistic presents one of the central problems for 
scientific invention. The scientific imagination is constrained by the require- 
ment that it invent theoretical schemes which enable the scientist, insofar as 
possible, to simultaneously further the conflicting values that constrain him. 
Holton [1973, 1978] has described the way in which relativity theory was for 
Einstein just such an achievement. Much of the beauty of relativity theory lies 
in the fact that Einstein was, to a remarkable degree, able to simultaneously 
realize the generally antagonistic interests of the radical empiricism of Mach 
and an almost Parmenidean vision of invariant being. 

On other occasions theoretical invention does not so much consist in 
achieving a synthesis of opposing goals as in taking up and furthering a goal 
that may have been ignored by the scientific community within which one 
works. Holton, again, has given us a number of examples of cases where 
scientific debate and paradigm shifts have centered around conflicts between 
scientists who held conflicting values (i.e., who were devoted to conflicting 
scientific themes). In such cases the antagonistic goals constrain the activity 
and development of the community as a whole, even if a synthesis is not 
discovered. In none of these cases is it thought that cognitive behavior is 
irrational because it is directed by inconsistent values. Instead, the behavior 
of individuals and groups would be seriously irrational if the persons involved 
persisted in seeking only consistent goals. 

A second condition which Laudan says epistemic values must achieve is 
non-utopianism (pp. 50-3). He defines a goal state or value as utopian if ‘we 
have no grounds for believing that it can be actualized’. This is remarkable. 
Surely there are many values one pursues even though one does not expect to 
fully realize them. A number theorist, for example, might pursue mathematical 
consistency and completeness and yet be aware that he will never fully 
achieve them. A scientist seeks a theory which allows him to completely 
capture both the unity and diversity of the domain he studies, yet he knows 
this is practically, if not theoretically, impossible. But if Laudan were to claim 
that the persons involved were really pursuing something less, we would ask 
him to specify that lesser goal. Presumably he must be able to specify the goal 
in question in order to decide upon its reasonableness. Such specification 
might take the form of a theory that realized the limited values in question. If 
we then took that specification to the mathematician or the scientist and asked 
them whether it fully satisfied the epistemic values they were seeking to realize, 
we could be fairly sure that the answer would be negative. No theory is perfect. 

The criterion of non-utopianism leads to several anomalies. First, the claim 
that a set of epistemic values is not utopian is equivalent to the claim that it is 
possible to construct a theory that would fully satisfy the values in question. 
Only in the unlikely case that we knew such a construction were possible could 
we be sure that the relevant values were not utopian. In most cases we could 
only be certain of the non-utopian character of a set of epistemic values by 
giving a theory that fully satisfied those values. But given the improbability (in 
science) of finding such a theory, the hope that one will be able to establish that 
a set of values is non-utopian is itself utopian. Second, if the non-utopian 
character of the epistemic values can be established only after they have been 
satisfied by an adequate theory, whether or not those values are utopian will 
be irrelevant to the assessment of that theory. Third, for the above reasons, 
the values in question could not provide any ‘rational’ guidance for the 
development of the theory before it was finalized. 

There is a fourth difficulty with Laudan’s non-utopianism. We have seen the 
improbability that one will be able to show that values are non-utopian. But 
how, apart from showing inconsistency, will anyone ever prove that they are 
utopian? Laudan suggests that we will be able to see that they cannot be 
realized in light of the laws of nature. But is it not clear that in doing science it is 
precisely the laws of nature that are at stake? In most cases, the claim that a 
consistent set of values is utopian would seem to be undecidable. 

Laudan discusses a variant of his non-utopian condition under the heading 
of specifiability. It is his view that an epistemic value is utopian if ‘there is no 
objective way to ascertain when that aim has been realized and when it has 
not’. For this reason he rejects such aesthetic values as simplicity and elegance 
which many scientists have claimed to be important guides for their thought. 
They are, he says, too imprecise and allow for no definite characterization. The 
reason for this is that the way in which a theoretical solution is elegant, or 
simple, or balanced, will depend on the unique characteristics of the problem at 
hand. Thus there can be no way to establish a well-defined criterion of 
theoretical beauty ahead of time, and therefore no objective way to ascertain 
when the aim has been realized. 

We are now in a position to see clearly the assumption that is behind 
Laudan’s various conditions on epistemic values. Laudan takes it for granted 
that legitimate and legitimizing values or goals are definite in the sense that it is 
possible to spell out ahead of time the condition of their having been realized. 
Thus Laudan does not draw a distinction between goals and values, using the 
terms interchangeably in the text. Hereafter, I use the term goal to refer to such 
definite ends as he describes, and reserve the term value for another context. 
My reason for doing so is that it is questionable whether the sort of definite and 
explicit ends which he considers are capable of providing a context within 
which theories may be assigned epistemological value or warrant. 

It has never been shown that it is possible to give a set of definite fully explicit 
criteria the satisfaction of which provides a compelling warrant for a set of 
scientific claims. The most obvious reason for this is, as Plato pointed out in the 
Meno, that one does not know in a fully explicit way what the truth is like until 
one finds it. This is why scientists are often surprised. A second reason is that if 
a goal can be explicated, it would seem necessary to validate the goal as well. If 
this were not done, how could one justify the use of that goal to validate a 
scientific theory? Laudan suggests that cognitive aims be evaluated in relation 
to those methodological aims that already inform scientific practice. Scientists 
should preach what they practice. But this does not solve the problem at hand 
for we can always ask for the warrant for current scientific practice. At any 
level of analysis, to commit oneself to a set of explicable but unsupported 
criteria would be an act of bad faith that might obscure, instead of revealing 
reality. 

Laudan understands this and consequently rejects any but a relativistic 
notion of scientific progress. He avoids a bad faith commitment to a particular 
set of explicable goals by allowing that there is progress only ‘relative to some 
set of aims’. Once we understand this we are presumably freed from bad faith 
claims to an absolute standpoint. But the problem is that Laudan has reached 
this clarity at the cost of relativism. Laudan does not avoid the relativism 
implicit in commitment to explicable ends. On the contrary, he embraces it. 


3 AESTHETIC VALUES: RATIONAL CONDITIONS ON KNOWING 


Because it does not appear that the sort of explicit goals Laudan suggests can 
warrant scientific theories, I propose that theories are warranted in the context 
of values which are distinct from definite goals in at least two respects. First, 
values are open. They are not capable of once and for all explication as goals 
are. As they unfold, they can serve as guides for the discovery of genuinely 
novel results. Second, values may be seen as providing a real, if tacit, access to 
the object of inquiry. In this way they serve to provide a context within which 
truth claims can be meaningfully and rationally warranted (i.e., epistemically 
valued). Without committing oneself to the Platonic doctrine of forms, this 
view allows one to give an answer to the question that inspired the Meno— 
'How can we inquire concerning that which we do not know?' My answer is 
like Plato's in that I claim that the scientist has a real, but inevitably tacit 
access to the world. Laudan, on the other hand, not having felt the force of the 
Socratic dialectic, still seeks explicit standards of scientific conduct. But in 
doing so, he is led as ineluctably as Meno was to accept a relativistic notion of
epistemic progress. 

But the relativism which Laudan preaches is inconsistent with his practice. 
Nowhere in Science and Values does Laudan appear to accept the relativism 
with respect to cognitive aims that he would foist upon the scientific 
community. To the contrary, he is quite definite about what does and does not 
constitute rational activity—even to the point of suggesting that to allow 
scientific work to be guided by utopian values is ‘pathological’. But on his own
account, he ought to preach what he practices. Accordingly, the relativism to 
which he is driven by his adherence to explicit norms is not acceptable in his
own terms. 

For reasons such as these Martin and Kleindorfer [1986] and Martin, 
Kleindorfer and Brashers [1987] have been led to their position that it is 
through a tacit and real access to reality that the scientific imagination is 
guided. 

The doctrine that the scientific imagination is ultimately aesthetic in nature, 
and that utopian aesthetic values provide an access to reality is not novel. For 
example, in The Foundations of Science [1913], Poincare argued that
aesthetic values direct the attention to aesthetically arresting hypotheses and 
facts during the creative process. Moreover, Poincare thought that it was 
rational to accept the direction of aesthetic values in the sense that it is by 
accepting such direction that one is led toward the truth. Poincare argues, ‘It is 
therefore the quest of this especial (intellectual) beauty, the sense of the
harmony of the cosmos, which makes us choose the facts most fitting to 
contribute to this harmony, just as the artist chooses from among the features 
of his model those which perfect the picture and give it character and life. And 
we need not fear that this instinctive and unavowed prepossession will turn 
the scientist aside from the search for the true. One may dream of a 
harmonious world, but how far the real world will leave it behind!’ (p. 367). 

Precisely because the access of mind to the world is actual, the values 
through which it is achieved remain utopian and cannot be fully explicated. 
Thus, it is not surprising that those values are antagonistic to one another. 
Taken together they constitute a coincidentia oppositorum, united through their 
necessary complementarity as perceived by the aesthetic sense. Moreover, it is 
only insofar as the opposing motives (e.g., unity and diversity) are taken
together and remain in dialectical tension that they represent genuine 
epistemic values. It is the antagonistic and complementary character of these 
values that underlies the contrapuntal and harmonic nature of the scientific 
dream to which Poincare referred in the passage quoted above. 

Moreover, there is evidence from outside the natural sciences that
inconsistent values are at the core of competent cognition. Such inconsisten- 
cies are understood to be adjudicated and, when possible, harmonized in the 
aesthetic imagination. Thus, the account given here is not unique to the 
physical sciences, nor to uniquely epistemic values. Jung [1923], Schumacher
[1977] and Rothenberg [1979] have argued that aesthetic imagination is 
similarly operative in harmonizing antagonistic values in psychological 
development, the construction of social institutions and writing poetry, 
respectively. The view introduced here in opposition to Laudan serves to 
integrate our understanding of scientific values into the broader structure of 
human valuing. 

The language in which such opposing values are most directly expressed is 
not the language of logic, but of metaphor. Because it serves to hold together 
opposing elements, metaphor provides a standpoint from which the tension 
among conflicting and complementary values may be appreciated and 
brought to bear on the development and criticism of knowledge claims. In 
contrast to logic (the language appropriate for explicable structures), meta- 
phoric language guided by the aesthetic sense is the vehicle through which 
values of the sort discussed here are often expressed. Thus it is possible to frame
‘root metaphors’ that express fundamental orientations in terms of which
scientific proposals may be evaluated, and accordingly, some notion of 
progress defined. 

Of course, the term ‘root metaphor’ brings to mind the fact that there are
various root metaphors which have held claim to competent scientific 
imaginations. But it has been suggested that the scientific community as a 
whole is held together by (among other things) a common, if not fully
explicable, access to reality. So, from the present point of view, even the 
clashing root metaphors that are operative in the sciences (e.g., as described in 
Holton’s well-known work) may be viewed as expressing values that are 
antagonistic and yet complementary to one another. 

Finally, the notion of aesthetically constrained epistemic values promises to 
shed light on one of the most perplexing problems in current philosophy of 
science. That problem is that paradigm shifts appear in some cases to 
constitute discontinuous, incommensurable systems, and yet many shrink 
from the relativism they believe incommensurability to require. Moreover, the
attempt to avoid the relativism thought to result from incommensurability by 
maintaining explicit standards for theory evaluation leads, as we have seen, 
right back to relativism. On the present account, however, one is not forced to 
choose between commensurability and relativism. A new paradigm may be 
incommensurable with its predecessor in the sense that there is no explicit rule
(or well-defined goal) in terms of which the shift may be measured. At the same 
time, a paradigm shift may represent a real advance in knowledge because it is 
an elaboration of one or more of the not fully explicable epistemic values that 
provide the access a community of scientists has to a domain of inquiry. In 
some cases, a paradigm shift may represent a change in root metaphor; in 
others, the playing out of an unexploited aspect of the root metaphor already 
dominant in the field. In any case, the view introduced here appears to allow 
for incommensurability in terms of explicit rules as well as a non-relativistic 
notion of scientific change. This intriguing possibility deserves further 
exploration. 


4 SUMMARY AND CONCLUSIONS 


This paper has introduced a proposal concerning the role of aesthetic values in 
the development of science. I have offered a critique of a contemporary analysis 
of the place of values in science as a point of departure for discussion. It has 
been argued that the view that scientific goals can be explicated leads 
inevitably to a self-stultifying relativism. In contrast, I have attempted to revive,
in terms of epistemic values, the Platonic doctrine that epistemic progress is 
possible because knowers possess a real but inevitably tacit access to reality 
(see Kleindorfer and Martin [1983]; Martin and Kleindorfer [1986]; Martin, 
Kleindorfer and Brashers [1987], for further discussion). It has been suggested 
that it is the aesthetic imagination that both constrains and enables the 
operation of antagonistic but complementary values in the development of 
knowledge. Theories are valued insofar as they are seen as simultaneously 
satisfying the demands of conflicting epistemic values. Such satisfaction is the 
experience of scientific beauty. It results in the conviction that the theory is, in
some important respect, true. This too is an ancient Platonic doctrine— 
‘Beauty is the radiance of truth’. 


Department of Psychology 
The Pennsylvania State University 
University Park, PA 16802 


REFERENCES 


HOLTON, G. [1973]: Thematic origins of scientific thought: Kepler to Einstein. Cambridge,
MA: Harvard University Press.

HOLTON, G. [1978]: The scientific imagination: Case studies. London, England: Cambridge
University Press.

JUNG, C. [1923]: Psychological types. Princeton, NJ: Princeton University Press, Bollingen
Paperback.

KLEINDORFER, G. B. and MARTIN, J. E. [1983]: The iron cage, single vision, and Newton's
sleep. Research in Philosophy and Technology, 3, pp. 127-42.

LAUDAN, L. [1984]: Science and values. Berkeley, CA: University of California Press.

MARTIN, J. E. and KLEINDORFER, G. B. [1986]: Mind as a rule-governed device: Tom Swift
and his amazing truth machine. Logos, 7, pp. 35-56.

MARTIN, J. E., KLEINDORFER, G. B. and BRASHERS, W. R. [1987]: The theory of bounded
rationality and the problem of legitimation. Journal for the Theory of Social
Behaviour.

POINCARE, H. [1913]: The foundations of science. Lancaster, PA: The Science Press.

ROTHENBERG, A. [1979]: The emerging goddess. Chicago: University of Chicago Press. 

SCHUMACHER, E. F. [1977]: A guide for the perplexed. New York, NY: Harper and Row. 


Brit. J. Phil. Sci. 40 (1989), 365-367 Printed in Great Britain


An Anomaly in the D-N Model of 
Explanation 


ALEX BLUM 


ABSTRACT 


It is argued that the constraints placed on the non-law premisses of a D-N 
explanation are irrelevant to their function and will not salvage the deductive 
requirement from triviality. 


A true law-like sentence (or conjunction of them) L, explains a true sentence E, 
on the Hempel-Oppenheim deductive-nomological account of explanation 
(henceforth ‘D-N account’) only if:


(x)  L
     C /∴ E


is sound, where C is a statement (or conjunction of statements) of ‘antecedent
conditions’.¹

The role of C at the inception of the D-N account is auxiliary to that of L. So
that while in a bona fide explanation L may imply E, C may not. For as Hempel 
writes ‘the explanation of a phenomenon, we noted, consists in its subsumption
under laws or under a theory’.² (The function of C presumably being
limited to subsumption.) That is, C is to link L to E.³

It was, however, soon realized that unless some constraints are imposed on 
C, the backbone of the D-N account, i.e., that (x) be sound, is trivially satisfied by
any L and E.⁴ For any given two true sentences p and q, there is a sound
non-question begging argument from one to the other, i.e.:


(β)  p
     ⌜p ⊃ q⌝ /∴ q
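
The triviality claimed for this schema can be checked mechanically. What follows is only a minimal sketch, assuming that ⊃ is read as the material conditional and that an argument is sound when it is valid and all its premisses are true. The function names `implies`, `modus_ponens_is_valid` and `is_sound` are illustrative and are not part of the D-N literature.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q


def modus_ponens_is_valid() -> bool:
    """Truth-table check: whenever p and (p implies q) are both true, q is true."""
    return all(
        q
        for p, q in product([True, False], repeat=2)
        if p and implies(p, q)
    )


def is_sound(p: bool, q: bool) -> bool:
    """Sound = valid form plus true premisses (p and the conditional linking p to q)."""
    premisses_true = p and implies(p, q)
    return modus_ponens_is_valid() and premisses_true


# Any two true sentences yield a sound argument from one to the other,
# because the linking conditional is automatically true.
print(is_sound(True, True))
```

On this reading, the linking premiss carries no information beyond the truth of the two sentences, which is why some constraint on C is needed.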


1 See Hempel and Oppenheim [1948] and Hempel [1964, 1965a]. It would appear that C is
required to be a closed first order sentence with no quantifiers. For the sake of clarity we ignore 
this requirement till the end of our paper. But see note 5. 

2 Hempel and Oppenheim [1948]: 264.

3 Thus, in criticism: ‘... then, as a consequence ... any given particular fact could be explained by
means of any true lawlike sentence whatsoever’. Ibid., p. 276.

4 Ibid., pp. 276-8.


Hence C cannot be equivalent to ⌜L ⊃ E⌝,⁵ nor, for that matter, should it be
possible to transform C into ⌜R · (L ⊃ E)⌝ where R does not imply ⌜L ⊃ E⌝.⁶

C does imply ⌜L ⊃ E⌝. In fact ⌜L ⊃ E⌝ is the common content of all the C's
which jointly with L yield E. So that the requirement that C not be equivalent
to ⌜L ⊃ E⌝ or to ⌜R · (L ⊃ E)⌝ where R does not imply ⌜L ⊃ E⌝ is in effect a demand
that C contain information irrelevant to the linking of L to E.⁷ That no criticism
of this requirement was based on it being irrelevant to the linking function of C
was no doubt due to the ineffectiveness of the requirement in saving the
condition that (x) be sound from triviality. The further stipulation that the
irrelevant information in C not be expressible as a conjunct of C was already
felt by some to be ad hoc.⁸ The artificiality in the restriction may be brought out
as follows.

Suppose there is a true singular sentence e stronger than E in the relevant 
manner. I.e., E does not imply e, and if e implies ⌜R · E⌝ then R implies E. Let L′ be
a singular instance of L and let ‘e → E’ be a rule of inference which allows the
drawing of an E sentence from its corresponding e sentence. That is, if e is ‘a 
melted at noon’, E may be ‘a melted’. It may very well be that for every E there is 
a corresponding e. Be that as it may, we have that any L explains any E for 
which there is an e. For 


(y)  L
     ⌜L′ · e⌝ /∴ E

is sound.⁹

Bar-Ilan University 
52100 Ramat-Gan 
Israel 


5 Or ⌜L′ ⊃ E⌝ (throughout), where L′ is an instance or conjunction of instances of L. See below.

6 See Kim [1963]: 288-9 and Hempel [1964]: 294-5.

7 And furthermore, as Ackermann [1965: 165] points out, it does not rule out the case where E is
replaced by ⌜E ∨ P⌝ where P is a sentence independent of C.

8 See Hempel [1964]: 295; and Ackermann [1965]: 160. 

9 The disallowance of assuming the truth of E in choosing C, as pointed out by Scriven [1959]:
468-9, [1962]: 181-7 and endorsed by Hempel, C. [1965]: 370-1, is not a viable option. Nor
for that matter must the truth of e be disallowed. Our problem is not whether but why E is true. 
On this last point see Nozick [1981]: 118-20. 


REFERENCES 


ACKERMANN, R. [1965]: ‘Discussion: Deductive Scientific Explanation’. Philosophy of
Science, 32: pp. 155-67.
HEMPEL, C. [1964]: ‘Postscript (1964) to “Studies in the Logic of Explanation”’
reprinted in Hempel [1965]: pp. 291-5.
HEMPEL, C. [1965]: Aspects of Scientific Explanation And Other Essays in the Philosophy of
Science. New York: The Free Press.

HEMPEL, C. [1965a]: ‘Aspects of Scientific Explanation’ in Hempel [1965]: pp. 331- 
496. 

HEMPEL, C. and OPPENHEIM, P. [1948]: ‘Studies in the Logic of Explanation’ reprinted in
Hempel [1965]: pp. 245-90.

KIM, J. [1963]: ‘Discussion: On The Logical Conditions of Deductive Explanation’.
Philosophy of Science, 30: pp. 286-91.

NOZICK, R. [1981]: Philosophical Explanations. Cambridge, Massachusetts: The Belknap
Press of the Harvard University Press.

SCRIVEN, M. [1959]: ‘Truisms as the Grounds for Historical Explanation’, in P. Gardiner
(ed.) Theories of History, pp. 443-75. New York: The Free Press.

SCRIVEN, M. [1962]: ‘New Issues in the Logic of Explanation’, in S. Hook (ed.), [1963]:
Philosophy and History, pp. 339-61. New York: New York University Press.


Brit. J. Phil. Sci. 40 (1989), 369-375 Printed in Great Britain 


DISCUSSION
If It Ain't Broke, Don't Fix It 


In a recent essay in this Journal, ‘The Value of a Fixed Methodology’,¹ John
Worrall has taken me to task for having claimed in my Science and Values 
[1984], that not only have the theories of science changed through time but so 
alike have the methods and aims of science. He and I disagree both about the 
factual claim that the methods have shifted and about our appraisals of the 
philosophical significance of such shifts. This dispute goes well beyond 
differences between Worrall and me about historical and philosophical 
matters. It is one of the two or three central issues which divide philosophers in
the so-called historical school, both from other camps of philosophy and 
amongst themselves. This debate pits Popper against Kuhn, Lakatos against 
Toulmin, McMullin against Shapere, and Worrall against me. Because the 
topic is thus of relatively broad interest and provenance, I want to respond 
briefly to some of Worrall’s criticisms of my work on this score. 

(1) Worrall is disturbed by the prospect that the methods, aims and
standards of the scientific enterprise might change through time. I shall, in due 
course, present some reasons to think not only that they do change, but that 
something would be very bizarre about the scientific enterprise if they did not. 
However, before I deal with that question, there is a prior issue which must be 
grappled with. Worrall makes it vividly clear that what really frightens him 
about the prospect of changes in scientific rationality is that such changes, in 
his view, open the floodgates to relativism. As he puts it at one point: ‘If no 
principles of evaluation stay fixed, then there is no ‘objective viewpoint’ from 
which we can show that progress occurred . . . However this is dressed up, it is 
relativism’. (p. 274) Again, he says that ‘without such an (‘invariant core of 
methodological principles’) the model (viz., Laudan’s) collapses into relativ- 
ism’. (p. 275) And early on in his essay, he insisted that ‘laying down fixed 
principles of scientific theory-appraisal is the only alternative to relativism’. 
(p. 265). The initial point I wish to make is that Worrall has wholly 
misconstrued the threat from relativism. The central claim of the epistemic 
relativist, at least where standards and methods are concerned, is not that 
those standards change but that—whether changing or unchanging—those 
standards have no independent, non-question begging rationale or founda- 
tion. Even if man had been using exactly the same inferential principles ever 


1 British Journal for the Philosophy of Science, 39 (1988), pp. 263-275.


since the dawn of science, the relativist would doubtless ask, and properly so, 
‘What is their justification?’ 

I believe that there is an answer to the relativist's challenge to show how 
methodological or epistemic principles can be justified; indeed, much of Science 
and Values was an attempt to sketch out one such response. But the central 
point I want to make in this opening section is that the challenge of relativism
is exactly the same whether the methods of science are one or many, constant 
or evolving. If we can answer that challenge, i.e., if we can show why certain 
methods are better than others, then we can offer a justification for the current 
methods of science, even if they are different from the methods of science of 
three centuries ago. If, on the other hand, we cannot resolve the relativist's
meta-philosophical conundrum, then it will be wholly beside the point
whether methods are constant or changing. Worrall's insistence that an 
acknowledgment that the methods of science might change is what greases 
the slope to relativism is a symptom of a deeper failure to realize that we are 
facing a significant meta-epistemological problem—one that is equally acute 
whether the methods of science have changed or have remained always the 
same. Sporting bumper stickers proclaiming that ‘scientists always do it the 
same way' is a laughably feeble response to the relativist's demand. 

I thus categorically reject the suggestion that the thesis that the methods of
science change in itself gives aid and comfort to relativism. What does give
comfort to relativism is a failure to address the question: ‘How are methodological
rules or standards justified?’ I have claimed elsewhere, and will repeat
it here, that most of those philosophers whom Worrall sets out to defend in his 
essay (e.g., Popper, Carnap, Hempel and Reichenbach) opened themselves up to
the relativist challenge in a particularly vivid form by espousing a view about 
the aims and methods of science which is through-and-through relativist in 
character. Popper, for instance, repeatedly says that the methods of science are 
nothing but ‘conventions’.² He likewise says that, provided a set of scientific
aims or standards is internally consistent, it cannot be philosophically 
criticized (which is tantamount to saying that every consistent set of aim- 
theoretic proposals for science is equally kosher philosophically). Reichenbach 
was equally cavalier about choosing between rival sets of aims for science. If in 
doubt, read the opening chapter of Experience and Prediction, where Reichen- 
bach says that the matter of selecting aims for science is a ‘volitional decision’: 
if—and this is Reichenbach’s own example—someone says that the aim of 
science is to make people happy rather than to find out the truth about the 
world, then we may disagree with him but there is nothing we can do to fault 
his proposal.³ Carnap took a similarly subjectivist view about the aims of


2 Even Popper's erstwhile disciple Lakatos was willing to concede that: ‘Popper never offered a
theory of rational criticism of consistent conventions.’ (I. Lakatos, The Methodology of Scientific 
Research Programmes (Cambridge: Cambridge University Press), p. 144.) 

3 See H. Reichenbach, Experience and Prediction (Chicago: University of Chicago Press, [1938]),
especially pp. 10-13. 


inquiry. And Hempel's opus (prior to this decade) can be searched in vain for 
any serious discussion of the status of methodological rules. It is for such 
reasons that I claim that none of these figures has an even prima facie plausible
story to tell about how the aims or methods of science, whether fixed or
changing, can be justified.⁴

Clearly, if the mainstream tradition in philosophy of science preaches that 
the methods of science are conventions, and that the aims of science are largely 
matters of personal preference, it does not take much agility to find therein the 
makings for a thick relativist stew. Indeed, as I have tried to show in detail 
elsewhere,⁵ the core ingredients of contemporary epistemic relativism are
there for the taking in the muddle-headed meta-epistemology of the logical 
positivists and the logical empiricists. If, with Worrall, we want to give a 
convincing solution to the meta-methodological conundrum (and I trust that 
he and I are of one mind in that regard), then it is to no avail to dig in our heels 
and say that 'everything's okay as long as the aims and methods of science 
don't change'. 

Such an attitude carries the added liability that it runs inconveniently
counter to the historical facts, as I shall now try to show. 

(2) For more than a decade, I have been arguing that the evaluative 
principles and methods utilized by scientists change through time.⁶ In 1980, I
published a book, Science and Hypothesis, which sought to document in detail
some of the shifts that have occurred in the methodology of science between 
the 17th and the 20th centuries. Similar sentiments can be found in the work 
of Shapere and Toulmin. Worrall, by contrast, like Popper and Lakatos before
him, holds that the methods of science have changed not at all. In response to 
evidence of the sort I presented in Science and Hypothesis, Worrall concedes that 


4 Reichenbach's attempt at a pragmatic justification of induction is as close as anyone in this
group comes to spelling out how the methods of science might be appraised. Unfortunately, his
project does not succeed. But, against the background I have just described, he must be given
high marks for recognizing the importance of the problem.

5 See my ‘“The Sins of the Fathers ...”: Positivist Origins of Post-Positivist Relativisms,’ in
W. Savage, ed., Beyond the Positivist Consensus: Five Philosophers on the Philosophy of Science. 
Westview Press, forthcoming. 

6 It is important for the record to stress that I have not claimed that no methodological principles
have remained invariant over the course of science (say since the 17th century). What I have
shown is that some rather central methodological principles have been abandoned or
significantly altered over the course of time. Moreover, I have claimed that I can see no grounds
for holding any particular methodological rule—and certainly none with much punch or 
specificity to it—to be in principle immune from revision as we learn more about how to conduct 
inquiry. Where Worrall sees certain methodological principles (the exact ones are generally left 
unspecified by him) as constitutive of all science—past, present and future—I am reluctant to
embrace such a priorism; especially in the face of the fact that many of the methodological 
principles formerly regarded as sacrosanct have been happily abandoned (e.g., the principle that 
one event can be the cause of another only if it invariably accompanies the other—which was at
the heart of most ‘experimental researches’ prior to the mid-19th century). 


the methodological principles espoused by scientists have changed but he is 
reluctant to take scientists’ claims about their methods at face value. Evincing 
strong scepticism about scientists’ explicit self-reflections, he appears to share 
Lakatos’ view that scientists, including great scientists, chronically suffer from 
methodological ‘false consciousness’ about what they do and why they do it. 
(It has always been unclear to me how Lakatos and Worrall can plausibly 
maintain both that scientists’ implicit judgments about theories and evidence 
are virtually never wrong and that their explicit accounts of their reasons for 
their theory preferences are virtually never right.) Confronted with monumen- 
tal evidence that scientists’ pronouncements about their methods change 
dramatically from epoch to epoch, these defenders of a Parmenidean view of 
scientific rationality are forced to suppose that scientists are Koestlerian
sleep-walkers, stumbling from discovery to discovery, reduced to incoherence and
self-delusion whenever they attempt to describe what they are doing. Apart 
from the monumental psychological implausibility of supposing that great
scientists never really understand what they are doing (but that we 
philosophers do), I must confess to finding it rather uncharitable to suppose 
that scientists’ explicit pronouncements about their principles of inference and 
experimental design are uniformly wide-of-the-mark. But—purely for the sake 
of argument—I am willing to meet Worrall half-way by looking at what 
scientists do rather than at what they say about what they do. Worrall calls 
this distinction the difference between ‘explicit’ and ‘implicit’ methods or 
standards. 

Worrall believes that there is a set of ‘implicit methodological standards’ that 
scientists have ‘in fact always applied’. (p. 267) I have argued the other side of 
the case. I have claimed that there are many methodological standards implicit 
in 20th-century scientific research which were not always there. Examples I 
have given include blinded experiments (which emerged only in the 1930s) 
and controlled experiments (which became the norm only in the late 19th 
century). Other equally obvious examples of new forms of scientific reasoning 
involve the use of sophisticated statistical techniques for the analysis of data 
and the design of clinical trials. 

What does Worrall say in response to such cases? He forthrightly concedes 
that methods such as these have shifted significantly over the course of 
science. But, says he, such innovations in the methodology of science
depended upon ‘substantive discoveries’ about the structure of the world (such 
as the discovery of the placebo and the expectation effects). (p. 274) He is surely 
right about this much. Many of the methodological procedures implicit (and 
explicit) in contemporary science rest upon our having discovered certain 
things about this world which have to be guarded against, or otherwise dealt 
with, in our theories about how to interact with the world. But Worrall thinks 
it important to distinguish such substantive methodological principles, which
he views as ultimately derivative and secondary, from ‘the unchanging, abstract
formal principles of good science’.⁷ These latter principles are supposedly
independent of the vicissitudes of what we come to learn about how the world 
is constituted; in legal parlance, they are strictly procedural, not substantive. 
(If they were substantive, and thus dependent upon our theories about how 
the world is constituted, Worrall’s claim that these principles are permanent 
would be transparently unconvincing, given the rapidity with which funda- 
mental scientific theory changes.) So the issue now before us comes down to 
this: Are there any purely procedural (viz., non-substantive) principles of 
scientific inference and, if so, have those remained fixed through time? 

But before I turn to deal with that issue, it is worth reminding ourselves of
the circuitous route we have been obliged to follow. In response to the initial
claim that the methods of science have changed, Worrall conceded the point 
with respect to the explicit methodology of science but insisted that the implicit 
methodology of science was all of a piece. In response to my claim that many of 
the implicit methods of science have changed (e.g., principles and protocols of 
experimental design), Worrall concedes that implicit methods which rest on 
substantive beliefs about the natural world have changed but still stakes his 
Parmenidean case on the constancy of the ‘formal principles of good science’. 
(If I were a Lakatosian, I would be sorely tempted to begin muttering under my
breath about ‘degenerating problem shifts’; since I am not, I shall resist the 
temptation.) 

(3) What are these formal principles, which are wholly procedural and thus 
not subject to the shifting sands of our theoretical and factual beliefs? Worrall 
cites but one example in his paper; a principle to the effect that ‘theories should, 
whenever possible, be tested against plausible rivals’. (p. 274) Now I hold, as 
Worrall does, that this is a splendid methodological principle. But unlike 
Worrall, I think it is neither strictly procedural nor a principle implicit in all of 
great science since earliest times.

First, a few remarks as to its allegedly purely procedural status. I take it that 
one essential feature of a genuinely procedural, as opposed to a substantive, 
rule is that the former makes no concessions to the particular world we happen 
to inhabit and rests upon no (possibly revisable) assumptions about how that 
world is constituted. A procedural rule is, as philosophers used to say, a rule 
which applies in all possible worlds. Is Worrall's testing rule of that sort? 
Suppose we lived in a world very different from this one, a world in which there 
were only a finite number of objects, a finite number of space-time points, etc. 
Suppose in that world that we were entertaining the hypothesis that ‘All 
swans are white' and that we had managed fully to survey the swan 
population and determined that each was indeed white. Now, in such a world, 
the injunction to test our hypothesis against its ‘plausible rivals’ is wholly
gratuitous, since we can deduce our hypothesis from the evidence. Unfortu- 
nately, of course, we do not have any reason to believe that our world is 


7 Worrall's emphasis.

relevantly like the one I just described. We believe that the class of objects 
falling under the swan hypothesis is multiply infinite. It is because we believe 
that, that we also believe that we cannot deduce the swan hypothesis from any 
finite range of particulars; and it is because of that latter belief that we talk 
about ‘testing’ at all, let alone testing ‘against plausible rivals’. 

The general point is that all principles of theory evaluation make some 
substantive assumptions about the structure of the world we live in and about
us as thinking, sentient beings. The difference between procedural and 
substantive methodological rules is thus entirely a matter of degree and of 
context. And as soon as we acknowledge that point, it becomes clear that the 
cogency of any methodological principle is, at least in part, hostage to the 
vicissitudes of our future interactions with the natural world. But that is just 
another way of saying that methodologies and theories of knowledge are 
precisely that, theories. Specifically, our methodological rules represent our 
best guesses about how to put questions to nature and about how to evaluate 
nature's responses. Like any theory, they are in principle defeasible. And like 
most theories, they get modified through the course of time. 

And so they should, for it would be singular, would it not, if—after several 
thousand years of interrogating nature—we had not managed to learn that 
some techniques of interrogation that initially looked plausible failed ultima- 
tely to be appropriate and that other techniques of interrogation, which had 
not even occurred to our forbears, have proved quite effective? Put differently, 
why should one suppose that scientists (as Worrall would be the first to 
concede) routinely change their beliefs about the constitution of the natural 
world but that they never change their important beliefs or their practices (but 
only their rhetoric) concerning the evaluation of theories, the design of 
experiments and the analysis of data? 

I claimed earlier that Worrall's principle about testing does not enjoy the 
functionally a priori status he accords it (by deeming it formal and procedural) 
and that it is not a principle always to be found in past examples of great 
scientific practice. I should comment briefly on this latter claim. Ponder the 
history of rational mechanics between say Wren and Wallis, on the one end, 
and Euler and D’Alembert on the other. This surely counts as a pretty major 
episode in the development of scientific thought, beginning as it does with the 
first coherent formulation of the laws of elastic collision and terminating with 
the esoterica of Eulerian analytic mechanics. 

If one peruses the classic, early papers on collision by Wren and Wallis (both 
published in [1669])—papers on which Newton drew heavily in Principia— 
one looks in vain there for any reported observation, let alone what we would 
regard as an experiment. Yet it is those essays which won over the scientific 
community to the principle of the conservation of momentum. Now, absent 
the citing of any empirical data by these physicists, I suppose it is fair to 
conclude that they were not conducting a 'test', at least not in any sense that 


Brit. J. Phil. Sci. 40 (1989), 376-388 Printed in Great Britain 


DISCUSSION 
Fix it and be Damned: A Reply to Laudan 


Larry Laudan has made a series of important studies of the historical 
interaction of science and philosophy of science—many of them collected in 
his Science and Hypothesis ([1981]). He holds that these studies together with 
the work of others in the ‘historical approach’ (such as Stephen Toulmin and 
Dudley Shapere) unambiguously show that scientific change is not restricted 
to the level of accepted general theories. Instead changes have also occurred 
both in the methods and aims of science. In his Science and Values ([1984]), 
Laudan argued that these changes in methodology and ‘axiology’ are 
inconsistent with the ‘older’ empiricist approach to philosophy of science, but 
that they do not thereby force the acceptance of ‘big picture relativism’. Instead 
Laudan's own ‘reticulated model’ shows how, by piecemeal and rational 
modifications, change can spread through all levels of scientific commitment. 

The obvious worry with any such claim concerns how changes can be 
explained as rational if even the basic principles of rationality themselves are 
subject to change. And indeed the main thrust of my review of Science and 
Values (this Journal, 39, 1988, 263-75) was that Laudan’s ‘reticulated model’ 
is not in fact a genuine third alternative. Instead it either collapses into 
relativism or, because implicitly committed to an unchanging core of 
methodological appraisal principles, amounts to an (interesting) elaboration of 
the 'older' approach. I want to explain here briefly why Larry Laudan's reply 
[above pp. 369-75] has not led to a change of mind (and also to respond to 
some of Laudan's criticisms of my own position). 

Has methodology changed alongside changes in substantive theoretical 
claims? Unsurprisingly, it depends on what is meant by ‘methodology’. Is it 
part of the ‘methodology’ of present medicine that clinical trials are to be 
performed ‘double blind’? If so, then methodology has changed and changed, 
just as Laudan insists, in the light of substantive scientific discoveries (the 
placebo effect). Is it—at a more general level—part of the ‘methodology’ of 
present day physics that theories should be mathematically expressed? If so, 
then methodology has changed—at any rate since Aristotle. Was it part of the 
‘methodology’ of 18th and 19th century physics that all theories be 
deterministic and ascribe sharp values at all times to all quantities invoked? 
These assumptions certainly seem to have operated in that period not only as 
substantive metaphysical claims within already accepted theories but also 
(usually implicitly) as positive heuristic principles guiding the construction of 


Fix it and be Damned: A Reply to Laudan 377 


new theories. If that assumption does count as methodological then the 
quantum revolution—that is, a substantive scientific breakthrough—again 
induced important methodological changes. 

I would not argue that any of these usages overstretch the elastic term 
‘methodology’. Moreover, I agree with Laudan that it would be ‘bizarre’ if 
changes had not occurred in this extended ‘methodological’ domain, if we had 
not in some sense or other learned how to do science better alongside doing better 
science. But how exactly can the new methods be judged 'better' than the old? 
Laudan explicitly seeks a system which will deliver this judgement and 
explicitly accepts that a system which fails to deliver it entails relativism. I 
claim that the judgement can be delivered only if some core principles (of an 
abstract and general kind) are considered as fixed, as constituting rationality. 
These principles will include the basic tenets of deductive logic and intuitive 
rules for weighing evidence (especially the principle that special weight is to be 
given to a theory's predictive success²). It is methodology in this much more 
restricted, core sense that I claim is fixed and must be fixed if relativism is to be 
avoided. 

Laudan is, then, wrong that ‘He and I disagree . . . about the factual claim 
that the methods [of science] have shifted’ [above, p. 369]. If these methods are 
construed in his broad sense then no one could deny that they have shifted 
historically. He, on the other hand, makes it clear in his reply that it would be 
wrong to land him with the view that 'no methodological principles have 
remained invariant over the course of science (say since the 17th century)' 
[above, n. 6]. Our basic disagreement is this. Laudan [ibid.] 'can see no grounds 
for holding any particular methodological rule—and certainly none with 
much punch or specificity to it—to be in principle immune from revision as we 
learn more about how to conduct inquiry'. Whereas it seems to me clear that 
in order to make sense of the claim that we 'learn more' about how to conduct 
inquiry, some core evaluative principles must be taken as fixed. (Given that 
these principles are intended to be of great generality, they are not going to 
have much specific ‘punch’—but are nonetheless punchy enough to ground 
scientific rationality.³) 


1 For more details on this dual role of some statements in science see my [1985a]. That paper is in 
part a criticism of Shapere's attempt—in some respects mirroring Laudan's—to do without 
fixed 'presuppositions' in methodology. (See, e.g. Shapere [1984].) 

2 The important sense of ‘predictive success’ allows that a theory may perfectly well predict an 
already known fact. See my [1985] and [1989]. 

3 These core principles must indeed be considered as very general and intuitive otherwise (as 
Clark Glymour urged—personal communication) the ‘fixed methodology’ thesis would be 
refuted by the case of statistical methodology. It is surely true that when statistical-probabilistic 
hypotheses were introduced into science they produced new and taxing specific methodological 
problems. The ‘fixed corist’ has to argue that the rules for appraising statistical hypotheses 
in the light of evidence are the result of applying general (and previously held) intuitions to 
these new and taxing cases. Without claiming to be able to argue this in detail here, I should 


378 John Worrall 


Can Laudan’s ‘reticulated model’ explain scientific change as rational 
without invoking some fixed principles of logic and weighing evidence? His 
basic idea sounds very attractive: each part of our knowledge—substantive, 
methodological, and 'axiological'—is in principle revisable; but wholesale 
change never occurs; instead changes in one part of the ‘ship of knowledge’ are 
based on, justified by, the temporarily fixed other parts. Temporarily accepted 
theoretical knowledge may account for changes in methodology no less than 
temporarily accepted methodological knowledge may account for changes in 
theories. Thus, to take the principal historical example from Laudan's earlier 
work, the acceptance of the 'classical' wave theory of light (which invoked a 
highly 'theoretical entity'—the luminiferous ether) led to the rejection of the 
Newtonian inductivist methodology (which according to Laudan anathema- 
tized all theoretical entities) and to the replacement of that methodology by a 
more 'liberal' hypothetico-deductivism. This new methodology then went on 
to sanction the switch to theories which arose and were still better than the 
wave theory (‘better’ according to its own canons of course). 

But however attractive the view may appear on the surface it surely fails to 
withstand deeper analysis. The general faults can be gleaned from the 
difficulties involved in Laudan's historical example. First, I believe that the 
claim that Newtonian ‘inductivist’ methodology discouraged theoretical 
speculation of all forms is factually false (after all it clearly sanctioned the 
gravitational field). But the central problem here is logical. Suppose it were true 
that inductivism was generally accepted in 18th and very early 19th century 
science and that it really banned all genuinely observation-transcendent 
theoretical entities. How then could the classical wave theory with its 
luminiferous ether (the archetypal theoretical entity) ever have been accepted? 
Only, it seems, by contravening the methodology then in force. Of course, given 
the (on this picture) irrational decision to accept the wave theory, then the 
eventual rejection of the methodological ban on theoretical entities can easily 
be explained—you cannot have accepted theories which clearly embody what 
your methodology tells you are supreme vices, and entities do not come any 
more ‘theoretical’ than the luminiferous ether. But then the relativist (again as 
I understand his position) does not hold that reasoning plays no role in 
science—that would surely be absurd—but ‘only’ that an essential part is 
played by unreason. 

It might be objected that the above argument takes a very naive view of 
Newtonian inductivism in particular and of methodology in general. (Leplin, 


point out that it is perhaps less implausible than might at first appear. How otherwise could 
there be an argument over which rules of evidence in statistics are correct? Without such 
general intuitions against which to test proposed statistical principles, statistics would seem to 
be a conventional matter, a matter of defining a new game. There has certainly been a good 
deal of argument over precisely the question which rules of statistical inference are correct. 


personal communication.) Like all methodologies, Newtonian inductivism 
was a complicated affair involving subtly interconnected principles. It did not 
ban theoretical entities, but only discouraged theories which involved them. The 
methodology involved other principles and criteria: the wave theory no doubt 
scored highly on these. Hence the acceptance of this theory could be rationally 
explained; and, once accepted, the theory’s clear involvement of theoretical 
entities could rationally bring about a reappraisal of even a principle which 
merely discouraged such entities rather than banning them outright. 

This certainly lends the account greater historical verisimilitude. But what 
exactly is now being claimed? Presumably the older methodology here, no 
matter how subtle, could be spelled out. Would not the spelled-out form involve 
something like the principle that science should not accept theories that 
involve otiose theoretical entities—that it should accept theories involving 
highly theoretical entities only if there is some pay off for this in terms of 
increased (independent) empirical support? (As William Whewell pointed out, 
this is really the content, or at any rate the central content, of Newton’s famous 
vera causa principle.) But then of course that principle was not at all challenged 
by the wave revolution (and is indeed still accepted in science). It was precisely 
on the grounds of the stunning empirical predictive success of the elastic solid 
ether theory that scientists such as Airy and Powell accepted the theory. The 
more accurately the episode is described, the closer we get to ‘bottom-line’ 
methodological principles which there is no historical evidence have ever 
shifted. These bottom-line principles sanctioned the shifts both to new theories 
and to new methodologies in Laudan’s broad sense. 

The case of another favourite example of Laudan’s—the switch to ‘double 
blind methodology’ in clinical trials—is similar. No doubt, at one level of 
analysis, ‘reticulation’ occurred in this case: the placebo hypothesis emerged 
from science operating on the old methods and this eventually led to a revision 
of the methods used in the area. But how rather more precisely could this 
change have occurred? 

Suppose the substantive claim is being entertained that—at any rate in 
some cases and for some people—the beliefs of those taking a drug and of those 
dispensing it play a causal role in symptom relief; but further suppose that the 
‘old methods’ are still in force—that is there is not yet any methodological 
requirement that clinical trials be performed double blind. How could the claim 
now being merely entertained be validated? I can see no way of telling the story 
except along the following lines. 

The problem is to distinguish the extra effect of the ‘characteristic factors’ of 
the particular drug therapy at issue over the effects of its incidental features 
(such as the beliefs engendered in the patients in its likely characteristic effect) 
which can be expected to be common to a range of different therapies. In order 
to arrive at a legitimate view of the (likely) excess effect here then we clearly 
need to compare the response to treatment in several experimental groups— 


for example one in which the experimentees are indeed being given the drug 
and one in which they believe they are but are in fact being given a substance 
which other (presumably well-established) theories tell us have themselves no 
direct ‘characteristic’ effect on the condition concerned. Further, since any 
expectations the administering physician may have about the likely effect of 
the drug are (if operative at all) another incidental feature of the therapy, the 
extra effect of the characteristic features cannot be accurately gauged unless 
these expectations are controlled for. A fairly obvious suggestion for instituting 
such a control is that some method of conducting the trial be devised so that 
the physician does not know which substance (‘active’ drug or ‘placebo’) is 
being administered to any particular patient.* 

But then, while it may issue in a new method, this whole argument is clearly 
underpinned (and, if reasonable, must be underpinned) by unquestioned 
assumptions about proper general, scientific methodology. Double-blind 
‘methodology’ emerges as a particular application to a particular type of 
knowledge-situation of a general ‘core’ methodological rule. There is no more 
(and also no less) reason to talk of a methodological change in this case than 
there is in any case in which we have attributed some experimental effect to a 
particular factor but then newly come to suspect that some further factor may 
be playing a role in our experimental results and hence that we need to 'shield' 
our experiments against it. (For example, that we need to shield experiments 
from possible electromagnetic effects when we are testing gravitational 
hypotheses.) With an unchanging core methodology which implies that 
greater empirical support can legitimately be claimed for the hypothesis that a 
particular factor caused some effect if the experiment testing the hypothesis 
has been 'shielded against' other possible causal factors, then the double blind 
episode is easily explained as rational; without such an unchanging core, I can 
see no such explanation.⁵ 

Laudan argues in his reply, however, that it is exactly this sort of ‘fixed core’ 
position which holds ‘the makings for a thick relativist stew’ [above, p. 3]. 
According to Laudan, I have 'entirely misconstrued' the relativistic threat: 
even if it could be shown that some subset of methodological principles have 


* See Grünbaum [1984] for an especially perceptive and careful analysis of the (as he shows, 
often very obscurely characterized) notions of placebo therapy, placebo control, etc. The term 
‘characteristic factor’ is Grünbaum's: his paper should be consulted for further details. 

5 I would, then, vigorously deny Laudan's claim [above, p. 372] that controlled experiments 
‘became the norm only in the late 19th century’. What, for example, was Galileo doing in 
polishing his inclined planes except ‘controlling’ (of course, as he was aware only imperfectly) 
for the effects of friction? It may be that Laudan has some more specific idea of a controlled 
experiment in mind—perhaps the specific procedure of performing clinical trials and other 
physiological experiments using a ‘control group’? I am not sure when systematic clinical trials 
got started, but surely whenever it was the idea of using controls was simply an application of 
old and general ideas about good scientific method. 


remained fixed throughout the development of science, this would be a 
‘laughably feeble’ response to relativism [above, p. 370]. Carnap, Reichenbach, 
Popper and co, all of whom adopted this feeble position are, according to 
Laudan, exactly the people who, contrary to their intentions, opened the 
floodgates to relativism. (Indeed it is difficult to think of any rationalist or 
empiricist philosopher, no matter how much of a professed absolutist, who fails 
to encourage relativism on this view of Laudan's. Certainly Descartes, Spinoza, 
Leibniz and Kant would all count as floodgate-openers.) To defend first 
principles or fixed presuppositions is, for Laudan, not to defeat relativism but to 
invite it. To defeat relativism we need ‘to show why certain methods are better 
than others’ and thus ‘offer a justification for the current methods of science, 
even if they are different from the methods of science of three centuries ago’ 
[above, p. 370]. 

Laudan thinks his reticulated model supplies such a demonstration and 
hence such a justification. But, to repeat the above point, how exactly? Is the 
'justification' simply that our present methods turn out better when judged from 
our present point of view? But, as Mandy Rice Davis might have said, our present 
point of view ‘would say that, wouldn't it?’ The question is whether our 
present point of view is right to say that our present methods are better than the 
methods of science of three centuries ago. And a positive answer to that 
question requires some principles considered as outside the historical process. 
Laudan later says [p. 375] 'In my view, the history of the empirical sciences 
exhibits continuously increasing sophistication ... about what sort of 
evidence constitutes a test of a theory ...' But what would ground the 
assumption that we have learned more here, as opposed to simply believing 
that we have? What is the basis for the judgement that the empirical sciences 
have become increasingly sophisticated as opposed to degenerately baroque? 
Remarks like these make it seem that Laudan really belongs, without wanting 
to acknowledge it, to the ‘fixed core’ camp. To avoid joining the camp he must 
claim that even these judgements are grounded only within our present 
intellectual framework. But that position is classical historical relativism. Once 
again the ‘third’ position that Laudan seeks is excluded. 

But whether or not he himself adopts it, is Laudan right that even the ‘fixed 
core’ position inevitably leads to relativism? If so, then it would seem that I 
have succeeded only in showing that, one way or the other, relativism is 
inevitable. 

In fact, relativism as Laudan defines it, is inevitable. There is a potential 
infinite regress of justification and this means that ultimately the only way to 
avoid sceptical relativism is to dig in one's heels. How else can the sceptical 
relativist be prevented from forcing us down the regress by always asking for a 
justification of any justification he is given? Among serious people, most 
disagreements (at any rate of a factual nature) are, of course, resolved by 
finding shared standards at some deeper level: someone who starts by holding 


that the Biblical creation story is very likely true may be persuaded that in the 
light of deeper level standards of evidence that he shares with you, the story is 
in fact very unlikely to be true. But suppose instead his response to your 
argument is to deny your standards of evidence: he agrees that on present 
scientific standards his position is untenable but asks ‘what’s so good about 
science'? In the end you must stop the slide down the regress by exerting some 
force of your own. Somewhere along the line you just have to say that here we 
reach axioms and if the sceptic seriously questions them then you can help him 
no further and must simply (and ‘dogmatically’) brand him ‘irrational’. Popper 
(in his [1945], Volume 2, pp. 230-1) was especially clear that the adoption of 
the rational approach cannot itself be rationally justified: 


The rationalist attitude is characterized by the importance it attaches to 
argument and experience. But neither logical argument nor experience can 
establish the rationalist attitude; for only those who are ready to consider 
argument or experience, and who have therefore adopted this attitude already, 
will be impressed by them. ... We have to conclude from this that no rational 
argument will have a rational effect on a man who does not want to adopt a 
rational attitude. . . . But this means that whoever adopts the rationalist attitude 
does so because he has adopted, consciously or unconsciously, some proposal, or 
decision, or belief, or behaviour: an adoption which may be called irrational. 


Laudan cites an earlier remark by Reichenbach to much the same effect [above, 
p. 370]. 

Once openly acknowledged this may indeed be an uncomfortable position 
for a philosopher to take, but uncomfortable or not logic forces it on him. 
Indeed not only must the basic principles of scientific method ultimately be 
adopted dogmatically, so must those of deductive logic—as Frege, and following 
him Russell and Wittgenstein all clearly saw. The point was sharply 
emphasized ahead of Frege by Lewis Carroll. Suppose, as Carroll did in his 
famous dialogue between Achilles and the Tortoise, that someone accepts that 
p and accepts that p→q but refuses to accept q. One might try to convince him 
as follows: ‘modus ponens in general is truth-transmitting—if p is true then if 
p→q is true then q must be true; here p and p→q are both true so you must infer 
q’. But clearly this is hardly likely to convince: if someone really refuses to infer 
q from p and p→q, then it will not be surprising if he further refuses to infer that 


6 Popper was later tempted away from this (in my opinion correct) position by Bartley's 
development in his [1962] of ‘comprehensively critical rationalism’. The basic idea of CCR was 
that even the claim that everything is open to criticism is itself open to criticism, and that 
therefore Popper's critical rationality (unlike its 'justificationist' predecessors) was rational by 
its own lights. But although this may sound appealing, it evaporates under scrutiny. Of course, 
critical rationality can be faced with all sorts of criticisms—it fails to deliver any certain truths, 
for example. The critical rationalist will dismiss such a criticism as unjustified (though he will 
avoid using the word). But what justified criticisms might there be? And what underpins the 
standards here? 


he must infer q from the fact that if an argument is of a valid form and he 
accepts the premises then he must infer the conclusion and the fact that modus 
ponens is a valid form whose premises he accepts in this instance. This latter 
inference clearly itself (doubly) involves modus ponens. Like Russell, I see no way 
out of asserting that we know that modus ponens is (at any rate in clear cut 
cases) truth-preserving and either ‘doubting the sincerity’ of anyone who 
claims to disagree or being ready to brand such a person possibly sincere but 
definitely irrational. 

Of course we are seldom forced to make this admission because we generally 
operate quite happily in real disputes with shared and unarticulated 
background assumptions. But if the sceptic really presses, then the only option 
is, I believe, the honest admission that ultimately we must stop arguing and 
'dogmatically' assert certain basic principles of rationality. If Laudan is right 
that this honest admission entails relativism, then relativism wins. But the 
serious threat surely comes not from someone who simply exploits the infinite 
regress, but from someone who argues that if we attend carefully to the details 
either of logical argument or (more centrally here) to what is done in science 
then we shall find no single set of principles underlying the whole process but 
instead different principles at different times. The serious threat (or at any rate 
the threat that I take seriously) comes not from the creationist or psi-freak or 
whomever claiming that science rests on assertions which ultimately must be 
presupposed by its defenders and that therefore they and he, with his different 
presuppositions, stand on a par. It comes instead from one who argues that, 
even in the enterprise on which his opponents bestow the honorific title ‘science’, the 
underlying principles have changed over time; his own principles therefore, 
while admittedly different from those presently accepted by science, may even 
become the principles accepted by the science of the near future. So why 
should he now give them up? The answer to such a person is precisely to show 
that, on the 'bottom line', only one set of principles of appraisal has ever been 
supposed in that enterprise that seems to all the rest of us (I hope) the paradigm 
of a rational enterprise. 

It might seem that there is an obvious way out of having to defend our basic 
methodological principles dogmatically: they can instead be argued for along 
the following (Lakatosian) lines. We have certain clear-cut intuitions about 
the correctness of certain particular methodological judgements in science and 
about the incorrectness of other particular judgements in bad or pseudo- 
science. (Along perhaps with a whole range of cases which our intuitions leave 
grey.) General methodological principles can be argued for by showing that 
they yield the correct division between the white and black cases (all the white 
and black cases from the history of science). No doubt, as a matter of 
psychological fact the intuitions about particular cases are prior: it is through 
consideration of what our intuitions tell us about particular cases that we 
come to realize what general methodological principles we implicitly apply. 


But logically speaking this does not make the situation any more comfortable 
for the rationalist. For suppose someone fails to share (or claims not to share) 
our intuitions about particular cases. Suppose someone believes (or claims to 
believe) that intuitively recent creationist ‘science’ is better than Darwinism. 
Unless she shares all (or at any rate the great majority) of our other intuitive 
judgements (that is, unless she implicitly shares our general ‘first principles’ 
and has simply made a mistake in applying them to the particular case), then 
there is again surely nothing to be done save branding her irrational. She is 
playing a different game and, we defenders of science must ‘dogmatically’ 
assert, a worse one. 

Another way of arguing for the fixed core of methodological rules is by 
suggesting how they might have become genetically hard wired as a result of 
natural selection. Those who make the right inductions, those who base their 
actions on generalizations that have enjoyed predictive success had, and have, 
a selective advantage. This argument no doubt needs careful handling. But 
however carefully handled and however persuasive it can be made to seem, it 
is clearly circular. It is based on our belief in the correctness (or essential 
correctness) of Darwinian theory. But this in turn, if rational, is based on our 
methodological principles. 

There seems to me, then, no way of arguing for our basic methodological
principles that has any claim to logical priority. Assuming that they do indeed 
lead to the right division between black and white cases, we just assert them 
without argument. As Lakatos used to say (only half-jokingly) there comes a 
point when a rationalist must get out his machine-gun to defend rationality. 


There are two further criticisms in Laudan's reply on which I should like 
briefly to comment. First he takes me to task (along with Lakatos) for doling out 
to scientists amounts of ‘false (methodological) consciousness’ which defy 
reasonable belief. The problem arises from my concession that the sorts of 
explicit methodological pronouncements that scientists are likely to make may 
indeed shift over time, but their basic implicit methodology does not shift. But, 
says Laudan [p. 372], the idea that ‘scientists’ implicit judgements about 
theories and evidence are virtually never wrong [while] their explicit accounts 
of their reasons for their theory preferences are virtually never right' is a
‘monumental psychological implausibility'. 

Now in fact the account I adopt needs a great deal less false consciousness 
than Laudan supposes. As I emphasized above, it fully accepts that there have 
been methodological shifts in the broad sense of methodology—false con- 
sciousness needs to be invoked only where scientists’ pronouncements appear 
to go against the core of very general abstract principles of evidence. And I 
believe, quite contrary to Laudan, that the historical data—properly inter- 
preted—speak to substantial continuity even with respect to professed 
methodology at the core level. 


As I hint in my review, I do not believe, for example, that the methodological 
pronouncements of such ‘strict Newtonians’ as Newton himself and Thomas 
Reid are anything like as far out of line as Laudan has claimed (for example in
his [1981]) with the allegedly revolutionary ‘hypothetico-deductivism’ of
Whewell and others. (Laudan does after all face the problem that the
gravitational field—surely no less a theoretical entity than the luminiferous
ether—was firmly accepted during the period of alleged domination of 
Newtonian inductivism.) I believe that Whewell’s view is more accurately 
represented as a new(ish) gloss on Newtonian ‘inductivism’ than as anything 
like a revolutionary new methodology. 

So my (outline) answer to this criticism is that I agree with Laudan that if I
had to dole out ‘false consciousness’ in anything like the amounts he supposes 
then my position would be extremely implausible. Fortunately the amounts 
really required by my account seem to be small indeed. 

Laudan also argues that the position I advocate in opposition to his own 
rests on a distinction that is entirely bogus—the distinction, that is, between 
‘substantive’ and ‘formal’ (or ‘procedural’) principles. I suggest, remember, 
that Laudan’s cases of methodological change are better treated as ones in 
which substantive discoveries are ‘plugged into’ unchanging formal methodo- 
logical principles in the ‘old’ restricted sense to produce ‘new’ ‘methodological’ 
views in Laudan’s wider and undoubtedly substantive sense. But are there, 
Laudan asks, any non-substantive, purely formal methodological principles? 
And if, on the contrary, all methodological rules are underwritten by 
substantive metaphysical assumptions (even if substantive metaphysical 
principles of a very general kind) why should they be in principle unrevisable as 
we discover more about the world? 

I admit here to having been caught out making an illegitimate simplification.
It seems natural enough to distinguish principles—like ‘perform clinical
trials double blind’—which clearly rest on substantive (and relatively recently 
discovered) assumptions about the world, from more formal, more basic 
principles like ‘test your theories against any plausible rivals that exist’. But 
Larry Laudan is absolutely right that even those principles we are used to 
thinking of as merely formal in fact rely on (in their case very general but 
nonetheless strictly speaking) substantial assumptions about the world. (He 
could even have quoted some earlier publications of mine in his support.⁷)

For example, scientific procedures at the empirical level and in particular the 
use of empirical generalizations in technological applications surely cannot, 
pace Popper, be explained as rational unless some sort of inductive principle is 
adopted which enjoins the acceptance of appropriate generalizations from 
controlled experiments which have always turned out the same way in a
sufficient number of cases in the past.⁸ But clearly such an inductive procedure
is not going to work in all 'possible worlds'. The fact that we assume it will work
in ours means that our theory of rational acceptance assumes something
substantive about our world.

Similarly no adequate methodology for science could, I believe, fail to
include a basic principle which says that non-ad hoc accounts should always be
preferred to ad hoc ones (where of course both are available).⁹ Suppose, for
example, that a classical physicist responded to the difficulties with Mercury’s
motion by switching to a theory which says that every body in the universe
obeys Newton’s laws except for Mercury, and that Mercury moves according
to some specified empirical generalization ‘read off’ the facts. Although he
now of course has a theory which is better than his previous one in terms of
empirical adequacy he surely cannot legitimately claim that the fact that he
now has a theory that correctly predicts Mercury's orbit means that his theory
and Einstein's theory stand on a par in respect of empirical support from these
facts. But any such methodological principle which underwrote this judgement
would clearly have to rest (at any rate for a scientific realist¹⁰) on a
substantive, synthetic assumption about our universe: to put the point
figuratively, who says that God did not decide to make an ‘ad hoc’ exception to
all ‘laws’ of nature? Science simply assumes that he did not. (Feyerabend too
rightly—and repeatedly—emphasizes this point in his [1975].)

Laudan is correct: no methodological principle is purely formal. But does it
follow, as he suggests it does, that every such principle is open to revision in the
light of further discoveries about the world? I believe that we should resist the
inference from ‘substantive’ (and therefore ‘strictly fallible’) to ‘seriously
corrigible’. There is evidence from the history of science of the revisability of
our ‘methodological principles’ only if these are understood in Laudan’s very
broad and highly substantive sense. The principles from the narrower domain
may be substantive, but there is no evidence that the possibility need be taken
seriously that they might be revised.¹¹

JOHN WORRALL
London School of Economics


7 See, for example, my [1983].

8 This argument has been urged many times against Popper beginning in the 1930s with
Reichenbach and Feigl; I re-argue the point in detail in my [1989a].

9 Many philosophers—including Feyerabend (see his [1975]) and some critics of the account of
empirical support developed by Zahar and myself (see, e.g. Nickles [1985] and Howson
[1985])—have misunderstood the strictures against ad hoc hypotheses in Lakatos' and my own
work as claiming that any attempt to solve a particular problem facing a certain hypothesis is
illegitimate (this of course would be absurd) or at least as claiming that it is always wrong to
adopt a hypothesis which is not testable independently of the result it was introduced to
explain. In fact the only claim is the comparative one that less support accrues to a non-
independently testable hypothesis. Often enough, however, no non-ad hoc alternative is
available. If so, it is of course preferable to adopt that ad hoc, non-independently testable
hypothesis rather than leave matters as they are. The new theoretical system will save more of
the facts than its unamended predecessor. Moreover, that system may eventually be
augmented so as to become independently testable (and independently confirmed) precisely in
the area where the ad hoc hypothesis was invoked (and continues to be invoked). For example,
the only explanation that Fresnel could offer within his overall wave theory for why no
interference fringes were observed in the area illuminated by two closely adjacent but
incoherent sources was that there always are objectively interference fringes but that these
change from one moment to the next with such rapidity as to far outstrip the ability of our
visual apparatus to record them. When Fresnel first articulated this hypothesis it was
undoubtedly ad hoc not just in the clearly unobjectionable sense of ‘addressed to a particular
problem’ but in the sense that it had no independent support. Nonetheless it might still be true
and there was indeed evidence that it was true from the fact that it was the only known
explanation of the phenomenon that could be given within an overall theoretical system that
had elsewhere scored very striking empirical success. The hypothesis eventually became
independently testable in view of developments in theories of the atomic constitution of matter
and of the physiology of vision.

10 The anti-realist may appear to be better off in this respect: since simplicity is for him—
allegedly—a scientific end in itself, not in need of any metaphysical underpinning. But suppose,
as happens often enough, that a theory correctly accepted according to the canons of evidence
shared by the realist and anti-realist makes some prediction of some hitherto unobserved
general effect (a prediction not also made, or perhaps even contradicted, by some rival and
dispreferred theory). The 'anti-realist', no less than the realist, encourages the view that—at
the very least—it is more reasonable to trust this prediction, given its pedigree, than it would be
were it, say, simply plucked out of the blue by some alleged ‘seer’. But this clearly involves the
presumption of some tie-up between the canons of acceptance and the way the universe is, or is
likely to be. In fact both Duhem (with his notion of a ‘natural classification’) and Poincaré
explicitly adopt this presumption in one form or another. (No serious philosopher of science is I
think properly described as an anti-realist. As I argue in my [1989b] both Duhem and especially
Poincaré—whose names top most people's lists of serious anti-realist instrumentalists—are
much more accurately described as defending a structural realism.)

11 Laudan produces some alleged counterexamples to the ‘core’ principle I suggested in my 
review: that it is always a good idea to test our theories against plausible rivals. (In fact I do not 
believe that this principle is quite on the ‘bottom line’ but it is close enough.) The
‘counterexamples’ are, however, unconvincing. The principle only says that we should test
against plausible rivals if there are any. (If—to take one of the alleged counterexamples—there
were only finitely many swans in the whole history of the universe and all of them could be
inspected and they were all white then there would not be any plausible rivals to ‘all swans are
white'.) Moreover, the principle further does not of course imply that testing against plausible
rivals is the only way to argue for a theory. Laudan suggests (above, p. 374) that Wallis, Wren
and others argued for their mechanical theories without at all involving tests against rivals. In
fact, some of Wallis and Wren's arguments can, I think, properly be construed as involving 
tests against rivals (this need not, remember, involve new experiments and the rivals need not 
look plausible once the evidence is in) but even if they were claiming support in some other way, 
this does not trouble the principle. 


REFERENCES 


BARTLEY, W. W. [1962]: The Retreat to Commitment. New York: Knopf.

FEYERABEND, P. [1975]: Against Method. London: New Left Books. 

GRUNBAUM, A. [1984]: 'Explication and Implications of the Placebo Concept’ in G. 
Andersson (ed.), Rationality in Science and Politics. Dordrecht: D. Reidel. (An earlier 
version of this paper appeared in Behaviour Research and Therapy, 19, 1981.) 

HOWSON, C. [1985]: ‘Bayesianism and Support by Novel Facts’, The British Journal for the
Philosophy of Science, 35, pp. 245-51.

LAUDAN, L. [1981]: Science and Hypothesis. Dordrecht: D. Reidel. 

LAUDAN, L. [1984]: Science and Values. University of California Press. 

NICKLES, T. [1985]: ‘Beyond Divorce: Current Status of the Discovery Debate’,
Philosophy of Science, 52, pp. 117-207.

POPPER, K. R. [1945]: The Open Society and its Enemies. London: Routledge.

SHAPERE, D. [1984]: Reason and the Search for Knowledge. Dordrecht: D. Reidel.

WORRALL, J. [1983]: ‘Scientific Realism and Scientific Change’, Philosophical Quarterly,
32, pp. 201-231.

WORRALL, J. [1985]: ‘Scientific Discovery and Theory-confirmation’ in J. C. Pitt (ed.),
Change and Progress in Modern Science. Dordrecht: D. Reidel.

WORRALL, J. [1985a]: ‘The Background to the Forefront’ in P. Asquith and P. Kitcher
(eds.), PSA 1984, Volume 2.

WORRALL, J. [1989]: ‘Fresnel, Poisson and the White Spot: The Role of Successful 
Prediction in the Acceptance of Scientific Theories’ in D. Gooding, T. Pinch and S. 
Schaffer (eds.), The Uses of Experiment—Studies of Experimentation in Natural 
Science. Cambridge University Press. 

WORRALL, J. [1989a]: ‘Why both Popper and Watkins Fail to Solve the Problem of 
Induction’, in F. d'Agostino and I. Jarvie (eds.), Freedom and Rationality: Essays in
Honor of John Watkins. Dordrecht: D. Reidel. 

WORRALL, J. [1989b]: ‘Structural Realism—the Best of Both Worlds?', Dialectica, 43, 
pp. 1-26. 


Brit. J. Phil. Sci. 40 (1989), 389-403 Printed in Great Britain


DISCUSSION 


Van Rooijen and Mayr versus Popper: 
is the Universe Causally Closed? 


1 Introduction 

2 Van Rooijen's critique of Popper's interactionism

3 What Popper actually argues 

4 Mayr on whether biology transcends physicalism 

5 How to reject physicalism without being an interactionist 


1 INTRODUCTION


In a recent discussion in this journal (Van Rooijen [1987]), Jeroen Van Rooijen
raises the interesting issue of the logical connection between evolutionary 
theory and the assumption of a causally closed universe. He claims that there is 
no contradiction (p. 91), and I think he is quite right—they could both be false. 
But if there is no contradiction, there might still be an argument against the 
assumption of a causally closed universe, at least partly based upon evolution 
theory. The thrust of Van Rooijen’s discussion is to consider and to reject just 
such an argument, which he ascribes to Sir Karl Popper. But I do not think that 
Van Rooijen is completely successful. Partly this is because one of his 
arguments does not work against its target and partly because he somewhat 
misrepresents Popper’s argument. Part of the aim of this note is to disentangle 
what of Van Rooijen’s case against Popper’s views works from what does not. 
But I have a larger aim in mind also. 

Van Rooijen runs together two rather different doctrines that Popper 
adheres to: interactionism and the causal openness of the physical universe. It 
does not much matter for the purposes of his discussion that these two views 
are different, since they are certainly connected: interactionism implies causal 
openness, hence if evolution theory implies interactionism it implies causal 
openness. But the physical universe could be causally open and interactionism 
false. It is the main aim of this note to ask whether evolution theory has
anything to say that would discriminate between these two doctrines. A recent 
discussion of Ernst Mayr's concerning the differences between physics and 
biology (Mayr [1985]), has bearing here. 


390 Tom Settle 


Mayr proposes that philosophy of science needs to undergo some changes to 
accommodate what is special in organisms, but that we do not need to wait for 
philosophy to do the honours. ‘Biology . . . is creating a world view that is in 
conflict with any physicalist world view that ignores all that is characteristic of 
the world of life and that ignores everything not encountered in the world of 
inanimate objects’ (p. 61). There are echoes here of Alfred North Whitehead’s 
early proposals in philosophy of science, which arose out of criticism of the 
inadequacies of materialist philosophies (Whitehead [1919], [1920]). He 
proposed a ‘philosophy of organism’ (Whitehead [1925]), which sounds just 
the right name for what Mayr has in mind, whatever that is. It can hardly be
absolutely clear what Mayr has in mind since he is speculating about
developments that are not yet complete. But some features of the philosophy he
hopes to see emerge are clear. For instance, he is quite against vitalism and he
is in favour of emergentism and downward causation. So am I. So is Popper, as
far as I know. And so, I dare say, was Whitehead. If there is some metaphysical
space between physicalism and vitalism, is it fully occupied by interactionism
or is there room for (dare I call it?) ‘organismism’? This is a big question and I 
shall do no more than hint at the answer I incline to favour, which, in brief, is 
that one may embrace realism without being a physicalist at the same time as 
one thinks the physical universe causally open without being an interactio- 
nist. 


2 VAN ROOIJEN'S CRITIQUE OF POPPER'S INTERACTIONISM 


Van Rooijen reads Popper as arguing from the two premises that ‘mental 
experiences have emerged during evolution’ and that ‘Darwinism explains the 
emergence of things only if they make a difference', to the conclusion 'that 
mental experiences must have an influence on the physical world: the physical
world is not causally closed' (p. 88). He has two criticisms of this argument, 
which I shall consider in a moment. But first, I should say that I do not think 
this is one of Popper's arguments, though I do not mean that Popper denies 
either of the premises or the conclusion. On the contrary, I think he holds to all
three statements, only I do not think he argues from the first two to the third. It 
is possible that Van Rooijen has confused a number of different arguments of 
Popper's, all of which are presented in the work he cites: Popper and Eccles 
[1977]. One is to the effect that the universe is not causally closed; a second is 
to the effect that the emergence of mind can be explained by evolution theory. 
A third argument to the same double effect is implied in a discussion of how 
physicalism cannot do justice to the higher functions of language. Then there 
are two, not quite independent, arguments aimed at showing the inadequacies 
of epiphenomenalism and of the identity theory as solutions to the mind-brain 
problem. A reader might perhaps be excused for getting them all a bit confused 
because they are intertwined in presentation. I shall discuss them more fully in
Section 3, where I hope to show fairly exactly where in Popper's complex line 
of argument Van Rooijen’s criticisms can be made to bite. 

Van Rooijen's first criticism is that ‘the introduction of interactionism in
ethology would not only be in contradiction with statements of both founders 
of ethology, but also with the history of biology and even with that of natural 
science as a whole’ (p. 88). I think this point is probably quite right on both 
counts. As to the first count, probably neither Tinbergen nor Lorenz would 
endorse the introduction of interactionism into ethology. But then, I think, 
neither would Popper. His interactionism is not meant, as far as I read it, to be a
contribution to ethology, but to philosophy. Even so, it could hardly be decisive 
against introducing interactionism that neither founder favoured it. The 
second count is the more important. Interactionism may be in contradiction 
with the history of science. 

The point Van Rooijen seems to want to make here is that neither physics 
nor biology nor ethology made much progress until non-physical entities were 
no longer accepted as causations, and I think he wants us to infer that it would 
be a bad thing to change all that now. I completely agree with his historical 
hypothesis but I have serious reservations about what work he wants that 
hypothesis to do. Of course, it could not imply that science should not change 
its methodology or assumptions or that there should not be any change as to 
what kind of thing can count as a cause. In fact, on what can count as a cause
science has changed its mind quite a few times. I do not think Popper is
proposing that science should embrace interactionism, though he certainly is 
proposing that we should not think science rules it out. Here I agree with 
Popper. I do not think Van Rooijen's historical hypothesis can be turned into
an argument against there being non-physical causations. 

Putting this last point another way round, we can say that the historical 
hypothesis does not support the thesis that the existence of non-physical 
causations contradicts science. Of course, the claim that there are non- 
physical causations contradicts physicalism, but physicalism is not implied by 
science. Some people seem to find inductive arguments irresistible. They would
probably find physicalism irresistible, since it is supported by all the inductive 
arguments, whatever they are, which support science. But finding inductive 
arguments irresistible is out of harmony with science, whose history seems to 
tell an heroic story of people respecting inductive arguments—as evidenced by 
the refusal to believe what lacks their support—and yet, at the same time,
firmly resisting complete seduction—as evidenced by willingness to declare
seemingly well supported theories false, or at least less like the truth than their
replacements. For my part, I can resist inductive arguments, and I recommend
everyone's doing so on such matters as whether the universe is causally closed. 
Arguments based on scientific experience do not look as if they could be 
decisive here, not just because of any power induction may lack but principally 
because scientific practice begs this very question. 


To be sure, empirical arguments once bade fair to close even this issue. That 
was in the days when all the laws scientists knew were deterministic and when 
probability was taken to be a matter of ignorance. But those days are long 
gone. There is not a single deterministic theory left in science, save the ones 
that deal with mass effects or the ones that do not involve much precision. Go 
for the precise behaviour of single physical systems and the appropriate law 
will be statistical or stochastic or probabilistic. This makes a considerable 
difference to what we mean by causation and by the universe being causally 
closed, as I shall argue in a moment. Right now my point is simply that 
science’s success is not an adequate argument for the view which is the
metaphysical shadow of science's method. I go along with the suggestion 
Popper made half a century ago to distinguish between metaphysical 
hypotheses and what he called ‘rules of method’ (Popper [1935], [1959]). I
notice wide disagreement among scientists on all the interesting metaphysical 
problems. But science does not wait to resolve these issues. It does not wait 
even to resolve the issue positivists pressed as to whether any metaphysical 
hypothesis makes sense. As long as there is general agreement on rules of 
method—what I prefer to call ‘the constraints on science’s task'—science can 
proceed. 

I applaud science’s parsimony in eschewing explanation by non-physical 
causes, especially explanation by putative agents (minds, selves, demons, 
gods). Given that agents, for all their sometime consistency of character, are 
apt to act unpredictably or creatively or whimsically, search for unvarying 
regularities will be frustrated if explanation by whimsy is allowed. But this 
parsimony, adopted as a matter of method in pursuit of a laudable end, does 
not imply that agents are not causes. It merely implies that where agents are 
causes science will try to tell a different story. Now if science were to pull off the 
deterministic explanatory programme upon which it embarked centuries ago, 
that would probably completely undermine the theory that agents are causes 
since science’s story would leave absolutely no gap for them to poke their
fingers in. In contrast to the sincere religious zeal of early scientists, many later 
ones would no doubt endorse Laplace's joy at not needing interventionist 
hypotheses. All this despite the fact that we have absolutely no direct 
acquaintance with any causes other than those due to our own agency. All 
other ascription of causation is speculative, as Hume correctly said, though he
was only half right in saying we never perceive causes: he forgot propriocep- 
tion. But that explanatory programme seems to have fizzled. The story science 
currently tells about what, for ordinary people (and a few philosophers), is 
straightforwardly a case of an agent being the cause of something is full of 
causal holes. On a close look at the description science currently gives as its 
best explanation of so-called agent-causation, I am impressed by how unclosed 
the physical universe looks. One needs to be a little careful here about what is
being meant by ‘cause’, but I shall postpone my attempt to clarify this term to 
Section 4. 

Van Rooijen’s second criticism of the argument he ascribes to Popper 
amounts to a rejection of the second premise, that Darwinism explains the 
emergence only of things that make a difference. His point, quite rightly, is that 
Darwinism also explains the emergence of things that are neutral or even 
detrimental, providing that they piggyback genetically upon things that make 
a positive difference, though the explanation here is much weaker, much more 
like what I should call ‘permission’, than is the explanation of the differential 
survival across generations of phenotypic characters which bestow a comparative
advantage in reproduction upon their carriers. (What I mean by
introducing the notion of permission is this. Statistical theories can never say
what precisely will happen next. They can only say what is permitted to 
happen next, and what are the probabilities of particular permitted things 
happening—though some statistical theories in biology dealing with the 
direction of evolution do not even permit the computation of probabilities, as I 
argue in my [1982b]. But permission is not unimportant. Statistical theories 
rule a great deal out.) And Van Rooijen is quite right that what he rejects is a 
view of Popper's. It is explicit on page 88 of Popper's [1977], which Van 
Rooijen cites, and also on page 74, which he does not. However, Van Rooijen is 
mistaken to think that correcting this mistake of Popper's ruins one of his 
arguments for interactionism or for the causal openness of the universe. This 
particular premise does not appear in Popper's case for these doctrines, though 
he does use it as a premise in arguments aimed at other conclusions. 


3 WHAT POPPER ACTUALLY ARGUES 


In this section, I shall paraphrase the arguments, relevant to Van Rooijen's 
critique, which Popper actually uses in the first three chapters of the work Van 
Rooijen cites (Popper and Eccles [1977]), to show how Van Rooijen's second
criticism does not bite where he thinks it does but does bite somewhere else. 
The first two arguments are run together. Of these the first, to the effect that the 
universe is not causally closed, was actually presented by Popper in a lecture in 
1965 (Popper [1966], now Ch. 6 of his [1972]), but it has been elaborated as a 
result of Popper's developing, in lectures in 1967 and 1968 (Chapters 3 and 4 
of [1972]), his theories concerning what he calls World 1 (the purely physical 
world), World 2 (the world of mental events), and World 3 (roughly, the world 
of ideas—problems, solutions, arguments, designs, conventions, symphonies, 
plays). As it appears in [1977], it is intertwined with the second argument, to 
the effect that evolution theory can explain the emergence of mind. The 
combined argument goes something like this: 


(a) We accept things as real if they can causally act upon ordinary real 
material things. (pp. 9-10). 


394 Tom Settle 


(b) Matter transcends itself in the process of evolution when there emerge 
minds, and human language. (p. 11) 

(c) This can be understood better in the light of ‘organic evolution’ (the 

view that changes in the habits of some members of a species may 

precede changes in physical structure) and ‘downward causation’ (the 

view that wholes may act, as such, on their parts or on sue of a 

different order of size). (pp. 11-21) 

In evolution there is emergence of genuine novelty, quite ünpredictable 

from what went before. Indeterminism (the theory that some physical 

propensities of things are probabilistic) explains how this is possible. (pp. 

22-7) 

Among the emergents are consciousness, human language, and the 

human brain (whose physical development is partly the result of the 

growing use of language, which was obviously adaptive). (pp. 27-31) 

(f) Different levels of emergence interact, especially World 2 interacts with 
Worlds 1 and 3. Unembodied World 3 objects are real, since their 
existence makes noticeable physical differences to World 1, when they 
are grasped and used by people. (The example used here is of how 
toothache and the institution of dentistry affect our movements— 
phoning for an appointment, etc.) (pp. 32-47). (A second example, 
given later, is Russell’s discovery of an inconsistency in the foundation 
of Frege’s Grundgesetze, without seeing the work, and the problem of 
explaining all that was happening to Frege when he deduced the 
existence of the inconsistency from a letter of Russell’s about a different 
manuscript. (Pages 56-7)) 

(g) World 3 objects cannot be reduced to World 2 objects, nor to brain states 
or dispositions (even if World 2 objects could be so reduced). (p. 57). 


From (a), (e), (f) and (g) there follows: 


(h) the physical universe (World 1) is not causally closed (p. 57). (The other 
premises play the role of helping one to understand the later premises 
when they are reached.) 




From (a), (b), (c), (d), (e) and (f), coupled with the uncontentious
assumption that Darwinism explains the emergence of what makes a 
difference, there follows: 


(j) Darwinism explains the emergence of mind. (pp. 72-3) 


The offending references to the assumption Van Rooijen rejects do not appear
in these arguments at all. They appear immediately afterwards, when Popper 
is engaged in showing the inadequacies, as he sees them, in epiphenomenalism 
and in identity theory as solutions to the mind-brain problem. I share Popper's
view that both epiphenomenalism and identity theory are inadequate, but I
am afraid the arguments for this are slightly weaker than Popper seems, in
[1977], to suppose, and what weakens them is just the mistake Van Rooijen 
fingers. On page 73, in preparation for these arguments, Popper explains what 
he means by ‘the Darwinian view’, with which, he is going to argue, both 
epiphenomenalism and identity theory clash. Neither of these theories, Popper 
argues, can explain the emergence of subjective experiences in the way 
Darwinism can, because neither of them can allow causal effectiveness to non- 
physical entities. But the third principle in the four that summarize the 
Darwinian view does not look quite right. It is this: ‘If natural selection is to
account for the emergence of the World 2 of subjective or mental experiences, 
the theory must explain the manner in which evolution of World 2 (and of 
World 3) systematically provides us with instruments for survival’. 

Why must it? There are two obvious possibilities Popper may have had in 
mind: he may have thought that Darwinism could only explain features that 
were adaptive (the mistake Van Rooijen suggests he makes) or he may have 
meant, merely, to insist upon the facticity of World 2's systematic provision of 
instruments of survival. I do not pretend to know, though I strongly suspect he 
meant the latter, since he repeats this point both on page 74 and on page 88 
immediately after his mention of Darwinism. But suppose we deny that World 
2 does act so beneficially towards us. Suppose we say instead that all adaptive 
virtue resides in the genes that programme the development of the brain (if 
some do), and that consciousness and language, and all those thoughts which 
we know and love, are inactive by-products of the hidden processes of the 
brain. Would Darwinism be in ruins? Would we be involved in some 
contradiction? I do not think so. But I do think that we should be involved in a 
rather grand implausibility, as has been very well argued by Hans Jonas in the 
appendix of his [1984]. It is the implausibility of the impotence of subjectivity 
which counts against epiphenomenalism and identity theory. At the present 
time science shows no signs of being able to explain how mind, etc., could get to 
bulk so large in our lives if they were not at all advantageous. Perhaps this is 
what Popper means when he argues, as he does on page 74, that 
epiphenomenalism cannot explain, in Darwinian terms, the evolution of 
World 2. Perhaps he means that any attempted explanation in Darwinian 
terms lands epiphenomenalism in implausibility. But that is not what he seems 
to be saying, though it is perhaps what he should be saying. And what he 
seems to be saying is mistaken. The same applies to the parallel argument 
against identity theory (pp. 86-8). Van Rooijen's second criticism bites here. 


4 MAYR ON WHETHER BIOLOGY TRANSCENDS PHYSICALISM 


The question I ask in this section is not whether one could construct a doctrine 
that might not misleadingly be called ‘physicalism’ and that could embrace all 
that is special about living organisms—even people opposed to what
physicalism is currently taken to be might want to endorse such a doctrine.
The question is, rather, whether whatever results from the development of a
philosophy of science that answers to Mayr’s requirements would be 
recognizably the physicalism some people currently oppose. Put in other 
words: would some of the currently objectionable components of physicalism 
have to go to make room for an adequate philosophical treatment of 
organisms? I think the answer is 'Yes!', and I shall be specific about what has to 
go. 
What does Mayr say? If one contraposes physicalism simply to vitalism, then 
Mayr is a physicalist, because he wishes not to give up the principle ‘that all 
processes in living organisms are consistent with the laws of physics and 
chemistry’ ([1985], p. 53). Let's agree with that. But if physicalism implies an 
‘extreme reductionism’, he thinks biology refutes it (p. 58). But what makes 
reductionism extreme? Certainly not the programmatic aim to explain as 
much of a thing as possible by examination of its parts. Nobody challenges that
who applauds science. An extreme reductionism would be one that denied the
emergence of real novelty, the hierarchic organization of living things, and 
downward causation. I should like to try looking at each of these in turn, but 
they are so intertwined, it will be difficult. 

It is uncontentious and even old hat to say that wholes have properties their
parts do not have, and vice versa. So any historical emergence of any novel 
whole will, one expects, be accompanied by the emergence of novel properties. 
What does Mayr want to assert which extreme reductionists might want to 
deny? Principally, that not all the emergent qualities of systems at a higher 
hierarchic level will be explicable from the qualities of their parts, or more 
generally from the laws of lower levels. Notice, Mayr completely accepts what 
he calls 'constitutive reductionism', which I take to mean that there is no new 
stuff, no new dynamic principle, added as life moves from a lower to a higher 
level. (I am inclined to agree with this, though I almost certainly disagree with 
Mayr's assessment of what is there in the first place.) What he rejects is what 
we might call ‘nomic reductionism’, the doctrine that laws governing one level 
are reducible to laws governing a lower level. 

But there would be something vacuous about the laws at the higher level— 
they would not be causal laws—if they did not make any difference to life at the 
lower levels. If the entire causal story about some thing’s parts could be told in 
terms of the causal laws of the parts' level, then the display of as many 
emergent qualities as you might care to discern in the life of the whole would be 
purely epiphenomenal. What makes interesting the kind of emergent qualities 
Mayr says biology introduces is not just that without them we could not
predict certain properties of wholes, but that without them we would neither
predict nor explain certain properties at lower levels. In other words, what
makes emergentism interesting is coupling it with downward causation. 

This is the moment to clarify the term 'causation' a little. It is not easy to do
this since cause is such a fundamental concept that it is not likely to be
elucidated by displaying some other, supposedly more fundamental, concepts 
from which it is derived. One attempt at elucidation, based upon Hume's claim 
that all we can detect of causes is their constant conjunction with their effects, 
is to explicate cause in terms of laws: causal explanation becomes nothing 
other than the subsumption of the thing to be explained under some law. But 
this is unsatisfactory because some laws are merely kinematic, whereas a 
causal law should be dynamic, it should exhibit the source of the power to 
bring about the effect. Of course, all ascription of causal power to anything is 
conjectural and can never be proved conclusively, though it may be put pretty 
well beyond doubt and can often be disproved, or at least shown to be 
inconsistent with other things we want to believe. 

If we assume the spectator model of the scientist, the one Hume assumed, 
then we shall be in trouble trying to vindicate even the conjectural ascription 
of such a power to anything, because there is nothing in the vivid display of 
secondary qualities immediately present to the spectator to warrant such a 
concept. Whitehead pointed out, in his criticism of Hume, that Hume did not 
think through thoroughly his own point that we see with our eyes, hear with 
our ears, etc. We are not merely aware, Whitehead said, of the items in 
presentational immediacy; we know also something about causal efficacy 
(Whitehead [1929], especially Part II, Chs. V and VIII). As he puts it, ‘The
notion of causation arose because mankind lives amid experiences in the mode
of causal efficacy’ (Whitehead [1929], p. 175 in the corrected edition [1978]).
But what we know is not limited to what we know in perception: we know 
about causal efficacy also from exertion, from making an effort to bend or bite 
some object or move it or hold it still. Biology has taught us that all animal
learning is active, not passive. This active ingredient is missing from accounts 
of scientific learning on the (standard) spectator model. But scientists see what 
they see in their laboratories because they have prepared their apparatus in a 
very particular way, and they could not understand what they see if they left 
out of the reckoning their own preparatory work. (A recent welcome change of 
perspective, summed up in the slogan ‘If you can spray them, they're real’, 
comes in Ian Hacking's [1983].) 

A second way to attempt elucidation is to identify the (dynamic) cause of 
something with the source of its energy, but the relation between being the 
cause of, and being the source of energy for, an effect needs to be loose, if 
downward causation is to be rendered plausible, since all the energy available
to any whole can be parcelled out without remainder to its parts, either as 
belonging to them intrinsically or as being binding energy that keeps some 
parts connected to other parts. In fact, the energy equations are the business of 
physics or of physical chemistry, and if biology is to introduce any real 
causation into the world's story, we shall have to decouple causal explanation 
from explanation in terms of energy transfer. This can be done provided we can
assume that determinism is false. There could be no such thing as downward
causation if anything like Laplace’s or Einstein's determinism were true, since 
all the world’s causal story could be told in telling the tale of its tiniest parts. 
Mayr thinks determinism is false, and so do I. The implication is that the 
activities and properties of wholes will be underdetermined by the activities and 
properties—and energy transactions—of their parts. Although indeterminism 
does not imply downward causation, it leaves room for it. Room is left to 
conjecture that some states or processes of wholes control, in some measure, 
the fates of their parts, or, more generally, some processes at a higher level 
influence outcomes for elements at a lower level. Popper discusses this very 
question in the work Van Rooijen cites, in an attempt to show that 
considerations of the conservation of energy do not rule out the kind of mind- 
brain liaison he and Eccles propose (Popper and Eccles [1977], pp. 541, 542, 
545). Science does not give us a smooth causal story of how some carefully
monitored quanta of energy go about the business of being what really 
happens when someone’s decision appears to be the cause of some physical 
happening. 

When Donald Campbell introduced the expression ‘downward causation’, 
somewhat apologetically, in his [1974], he restricted his sense of it to the 
operation of laws: ‘all processes at the lower levels of a hierarchy are restrained 
by and act in conformity to laws of the higher levels’ (p. 180). He had in mind, 
particularly, the laws of natural selection. But it is not quite right to talk about 
processes being restrained by laws. Processes are restrained by things or by 
other processes not by laws, which describe without compelling. For there to 
be downward causation, it must be some higher level entity or process which 
restrains the lower process. (For a discussion, with many examples, see Arne F. 
Petersen [1983].) Now, not all the stories biologists tell of natural selection at 
work display selection’s dynamism. To do this they would have to say how 
some whole things impinge upon other whole things, the way the molecular 
theory of gases exhibits the dynamism hidden within the concept of pressure. 

Popper does not restrict downward causation to laws. He allows minds and 
ideas to be causes, though without going into much detail about the dynamics 
of their action. Nor does Whitehead, who, some time before Campbell coined 
the name for it, described downward causation explicitly in his discussion of 
‘The Order of Nature’ and ‘Organisms and Environment’ ([1929], Part II,
Chs. III and IV). Mayr cannot restrict downward causation in Campbell's way, 
either, if he is to be consistent about the importance of uniqueness of 
organisms, and the comparative unimportance of laws, in biology. Causal 
effects downwards, for Mayr, have to be the work of organisms, considered as 
wholes, rather than of laws, considered as restraints. With this I am in full 
agreement. But I think Mayr and I need a new concept, here. I think we need 
the concept of spontaneity. 

The actions of wholes which are interestingly effective downwards must be
underdetermined by antecedent causes, especially from below. There is no
question of a violation of energy conservation. But energy comes in small 
bundles and the laws describing their behaviour and predicting their future are 
statistical. Absolute precision in forecasting how energy will redistribute itself 
is not available, and there are many ways small beginnings may be amplified 
over time, even quite quickly, as in trigger reactions. Biology repeatedly 
testifies to the impossibility of forecast, even of rough forecast. For example, no 
one can forecast either the pace or the direction of evolution. As to what fills 
the gap left in the causal story by underdetermination, there seem to be three 
distinct choices: chance, spontaneity, and outside interference. Chance seems 
no cause at all, just the ignorance of it. Spontaneity is very familiar in inner 
experience. It need not imply non-physical causation, just autonomous 
organisms: it is a separate question whether spontaneity has a subjective 
dynamic, as I think it does, though I need not here press Mayr to agree. About 
outside interference I shall make no comment here. 

Obviously, if spontaneity is to be let in as part of the causal story, there will 
need to be some shift in common modes of thought both within and without 
science. It looks rather too much like the explanation by agency which Van
Rooijen rightly reminds us science is trying to do without, for spontaneity to be
a mode of explanation within science, unless science changes its spots. Is
biology asking science to change its spots in introducing downward causation
by organisms? I do not know, but it is certainly asking physicalism to change 
some of its doctrines. 

There are too many variants of physicalism for it to be easy to pin it down in
summary form, but I can identify three objectionable doctrines which I 
associate with physicalism in one form or another, and which are under 
challenge by Mayr's pressure for a new philosophy more hospitable to biology. 
Here they are: 


(1) Inertness of matter. The idea that matter is inert has probably been
dropped by modern physicalists already, though it was a central 
doctrine of old-fashioned materialism. It needs to go if spontaneity is to 
be allowed. 

(2) Exclusiveness of upward causation. 

(3) Exhaustiveness of explanation of wholes in terms of interactions of their 
parts. 


(2) and (3) together constitute what one might call the machine theory of 
wholes. We know how to make machines; and in making them we aim to 
control the machine’s behaviour by what parts we use. Organisms, in my 
view, are not machines, despite there being much we can learn about them by 
pretending that they are. 

And I can identify one more doctrine, which I think is at the heart of 
physicalism, a doctrine rejected by Jonas, Popper, Whitehead and myself, and
which I think has to go. I do not know how Mayr stands on this one, though I
suspect he endorses it: 

(4) Causal monopoly of objective physical entities. I think physicalists 
generally hold that only physical entities can be causes and only 
physical processes can be causal processes. This implies a sort of
metaphysical behaviourism, whose implausibility I have already mentioned.
I mean the theory that either things do not have inner lives or, if
they do, these inner lives are causally ineffective.


5 HOW TO REJECT PHYSICALISM WITHOUT BEING AN 
INTERACTIONIST 


The key question here is what the most fundamental entities in the universe 
(or at least in the epoch of the universe in which we participate) are. Common 
sense, and most thinkers in the Western tradition, until recent times, would
allow two kinds: bodies and minds. It has never been solved to everybody’s 
satisfaction how these two different kinds of things interact. Descartes, with his 
own definitions as to what distinguished body from mind, showed the problem 
up more sharply than usual, and offered an unsatisfactory solution. Recent 
materialists or physicalists try to tell the world’s story without reference to 
mind (except as an inexplicable epiphenomenon), and there is nothing wrong 
with making such an attempt. Another version of my key question is what to 
make of the story that results from this attempt: is it the whole story? 

Popper, it should be stressed, does not want to answer the question about 
what is fundamental ([1977], p. 2) and thinks we do not know how mind and
body interact (p. 153). He is thus somewhat at cross purposes with physicalists 
who do want to say what is fundamental—physical bodies and physical 
processes—and who repudiate minds as active, at least partly because there 
seems no glimmer of hope for saying how they could be. Popper contents 
himself with conjecturing the active reality of three different kinds of things, the
third being ideas, and with presenting the evidence for their reality and 
interaction. 

Van Rooijen, at least in his [1987], does not assert that the universe is
causally closed, only that Popper's argument for its openness does not hold, 
and that science had better not indulge in explanations based upon 
subjectivity. As I have suggested in my discussion of Mayr's views, I am not 
sure science can hold out against such types of explanation, if it is to become 
fully hospitable to biology, though I think physics can and should (I do not side 
with Eugene Wigner [1967] on how to interpret quantum theory). But must
we be interactionists?

Interactionism—and I am discussing Popper's version since it is the best I 
know of—strikes me as similar to scientific theories in this interesting respect: 
it is conjectural, explanatory, keeps faith with evidence, is true as an
approximation, but oversimplifies and is probably strictly false; with more
work we may find out how to replace it with something better. All this is in its 
favour, as I argued, against Mario Bunge's physicalism [1977], [1980], in my 
[1982a]. What I hold against physicalism from the start is not the boldness of
its explanatory programme—this I applaud—but the arrogance of its dismissal 
of the facts of experience. I include inner experience. In the end, all 
philosophies and all sciences must make their peace with inner experience; it is 
all we have unmediated access to. (Of course, I do not mean introspection. The 
inner experience I mean arises from extrospection.) 

Three considerations keep me from endorsing interactionism more fully. 
One is the implausibility of the suggestion that World 3 objects have a dynamic 
effect upon World 2 objects—though their existence is necessary for thought; 
things may be contributory causes without being dynamic, as is the case for
social institutions which collapse when people's support for them is withdrawn.
A second is how fragmented the mind is when one considers only
consciousness, and how unsatisfactory theories of the unconscious all are. The 
third, and most important, is my deep distaste for the bifurcation of myself into 
real, parallel (though interacting) parts. This is not how I experience myself, 
nor, I fancy, how anybody else (or any animal) experiences himself or herself, 
as I have argued at some length in my [1982a]. Mind has all the air of a
theoretical construct, abstracted from some intriguing, but fleeting,
characteristics of reality.

An alternative to interactionism, which goes deeper and tries to say what in 
the universe is more fundamental than body or mind, is available in rough and 
ready form in the philosophy of organism offered by Whitehead. His 
arguments against materialism are subtle but powerful. His examination of 
what exactly is presented in experience improves enormously upon the 
classical treatments. One need not agree with him as to how to develop his 
early work—many people are put off by the theology explicit in his later 
work—to appreciate the attractiveness of the alternative he offers to 
physicalism. Four points merit attention here: 


(1) The most fundamental entities in the universe, which are also the 
ultimate facts for sense awareness, are events (which are inherently 
processes). 

(2) Material objects are complexes of events, which take their identity from 
genetically caused repetitions of dominant characteristics, and are 
properly seen as abstractions from their underlying nexus. 

(3) Space and time and the laws of the adventures of material objects in
space-time are likewise abstractions from the reality they correctly, but
approximately, report. 

(4) Mind, likewise, is a theoretical abstraction, this time from the subjective 
rather than the objective components of events. (The components of
events are not themselves entities out of which events are made, and the
subjectivity of lower level events is negligible.) 


My primary aim in mentioning Whitehead’s philosophy is not to promote it, 
though I must say I very much favour the view that our concepts of both body 
and mind abstract from a more subtle and unified underlying reality, but to 
demonstrate the possibility of a philosophical viewpoint which avoids what is 
objectionable about physicalism without falling prey to what is objectionable 
about interactionism and without losing faith with science—the jury has not
yet been sent out on whether Whitehead’s philosophy has contributions to
make to relativity theory or to quantum theory. (For recent work see Robert
Russell [1987], regarding relativity, and Henry Stapp [1979], regarding
quantum theory.) 

As to whether the universe is causally closed, the viewpoint of organismism 
suggests that the scientific concept of the physical universe is an approximately
true abstraction from, an approximately true or homomorphic model of,
reality and that the causal account given within its terms will not tell all the 
story of the dynamics of reality, some space being left for other abstractions, 
other models, to point to dynamic elements of reality. In this sense, rather than 
in the sense of the physical universe being a fundamentally real world with 
holes in it, the physical universe is not causally closed. 

How does evolution theory connect with all this? It is obvious that evolution 
theory assumes and explains the uniqueness and diversity of organisms and 
that it presupposes a hierarchy of levels. Could it imply causal openness 
without implying interactionism, except perhaps as a first approximation? I 
think it can. For one thing, underdeterminism, which is explicit in evolution 
theory since all its laws are statistical, implies causal openness of the kind I 
have just described, though without implying (or forbidding) non-physical 
causation. Secondly, evolution theory also implies downward causation, 
which, even if it is taken to include the dynamism of subjectivity, need not 
include interactionism because of the existence of non-interactionist
alternatives.

I say evolution theory implies downward causation because without it 
natural selection could not be considered dynamic: although evolution is said
to occur with changes in gene frequency in the gene pool of a population and 
this may occur through drift, there is no doubt that it is the phenotype and not 
the genotype that the environment acts upon in natural selection. Downward 
causation is also needed to explain the emphasis upon organisms as wholes, 
without which evolution theory would be vacuous. Predators see prey as 
wholes, and vice versa, and, whether it be stalking or fleeing, an organism uses 
downward causation to coordinate itself for the task. But I do not think
evolution theory itself takes us further than the assertion of integrated and 
autonomous organisms. It is hospitable to, but does not require, subjectivity's
being causally effective, since evolution theory has the resources to explain the
emergence and development of subjectivity, as well as it explains anything, if 
dynamic subjectivity is treated as factual. However, an argument that links 
evolution theory to dynamic subjectivity works just as well for Whitehead’s 
version of this, or for Jonas’s somewhat different, but still integrated, view 
([1966], [1984]), as it does for Popper's interactionism. Therefore, I think we 
can say that evolution theory does afford grounds for distinguishing between 
causal openness and interactionism, since it favours the one without 
favouring the other. 


TOM SETTLE 
University of Guelph 
REFERENCES 


BUNGE, MARIO [1977]: ‘Emergence and the Mind’, Neuroscience, 2, pp. 501-9.

BUNGE, MARIO [1980]: The Mind-body Problem, Pergamon Press.

CAMPBELL, DONALD T. [1974]: ‘“Downward Causation” in Hierarchically Organized
Biological Systems’, in F. J. Ayala and T. Dobzhansky (eds), Studies in the Philosophy
of Biology, University of California Press, pp. 179-186.

HACKING, IAN [1983]: Representing and Intervening, Cambridge University Press.

JONAS, HANS [1966]: The Phenomenon of Life, University of Chicago Press.

JONAS, HANS [1984]: The Imperative of Responsibility, University of Chicago Press.

MAYR, ERNST [1985]: ‘How Biology Differs from the Physical Sciences’, in D. J. Depew
and B. H. Weber (eds), Evolution at a Crossroads, The M.I.T. Press.

PETERSEN, ARNE F. [1983]: ‘On Downward Causation in Biological and Behavioral
Systems’, History and Philosophy of the Life Sciences, 5, pp. 69-86.

POPPER, SIR KARL R. [1935]: Logik der Forschung, Springer Verlag.

POPPER, SIR KARL R. [1959]: Logic of Scientific Discovery, Hutchinson.

POPPER, SIR KARL R. [1966]: Of Clouds and Clocks, Washington University Press.

POPPER, SIR KARL R. [1972]: Objective Knowledge, Oxford University Press.

POPPER, SIR KARL R. and SIR JOHN C. ECCLES [1977]: The Self and Its Brain, Springer
International.

RUSSELL, ROBERT J. [1987]: ‘Whitehead, Einstein and the Newtonian Legacy’, MS,
available through the Center for Process Studies, Claremont, Cal.

SETTLE, TOM [1982a]: ‘Letter to Mario: The Self and Its Mind’, in J. Agassi and R. S.
Cohen (eds), Scientific Philosophy Today, Reidel, pp. 357-79.

SETTLE, TOM [1982b]: ‘Indeterminism Undermines Science’, Fundamenta Scientiae, 3,
pp. 103-12.

STAPP, HENRY P. [1979]: ‘Whiteheadian Approach to Quantum Theory and the
Generalized Bell’s Theorem’, Foundations of Physics, 9, pp. 1-25.

VAN ROOIJEN, JEROEN [1987]: ‘Interactionism and Evolution: a Critique of Popper’,
British Journal for the Philosophy of Science, 38, pp. 87-92.

WHITEHEAD, ALFRED N. [1919]: An Enquiry Concerning the Principles of Natural
Knowledge, Cambridge University Press.

WHITEHEAD, ALFRED N. [1920]: The Concept of Nature, Cambridge University Press.

WHITEHEAD, ALFRED N. [1925]: Science and the Modern World, Macmillan.

WHITEHEAD, ALFRED N. [1929]: Process and Reality, Macmillan.

WHITEHEAD, ALFRED N. [1978]: Process and Reality, Corrected Edition, D. R. Griffin and
D. W. Sherburne (eds), Free Press.

WIGNER, EUGENE [1967]: Symmetries and Reflections, The M.I.T. Press.


Brit. J. Phil. Sci. 40 (1989), 404-408 Printed in Great Britain


DISCUSSION 


Epiphenomenalism and Machines: 
A Discussion of Van Rooijen's 
Critique of Popper 


1 Epiphenomenalism and philosophy 
2 Machines 


1 EPIPHENOMENALISM AND PHILOSOPHY


Some claims of Popper (Popper and Eccles [1977]) about humans,
machines, interactionism and epiphenomenalism are discussed by Van
Rooijen [1987]. I would like to discuss some aspects of Van Rooijen's critique. 
If the objective approach to science is applied to the study of behaviour, then 
even if epiphenomenal qualities exist, they cannot be scientifically proven. As 
Van Rooijen said, they belong to the subjective sphere and the subjective
sphere is not observable to anybody else except to the subject to whom it 
belongs. So, one can subjectively observe only one's own subjective sphere. 
Thus, epiphenomenal qualities, as subjective qualities, are not accessible to 
objective scientific research, nor to more than one person. Here, some
classical questions can be stated. How can Van Rooijen conclude that other 
people also have epiphenomenal qualities only on the assumption that he has 
them and that other people are physically similar to him? That they are 
physically similar to him can be proved by scientific, objective method because
the physical structures of human beings can be studied by more than one 
person and can be observed by exterior senses, but epiphenomenality cannot 
be observed by more than one person and is not accessible to the exterior 
senses of others, so it is not accessible to scientific methods and cannot be
scientifically proven. From the fact that other people are physically similar to Van
Rooijen, it does not follow that they are also epiphenomenally similar to him,
i.e. that if they are physically similar to him, they are also similar in non-
physical respects. Why should it be that if the objective, physical spheres of
beings are similar then the subjective spheres must be also similar? Maybe
some beings that are physically similar to each other have a subjective sphere 
and the others of the same species do not. Scientifically, this claim has the same
value as the claim that all people have epiphenomenal qualities. They are the
same because neither can be scientifically proven since science cannot observe
these subjective epiphenomenal qualities. Objective science can always find
only physical or physiological properties which are, according to Van Rooijen, 
accessible to exterior senses and to observation of more than one observer. 
Interactionism as a theory has this advantage over epiphenomenalism: it is
falsifiable. According to interactionism, mental non-physical
events, if they exist, can causally influence the physical world. So we
must look for cases where we have a complete physical or physiological
description of events in the brain, the nervous system and the behaviour triggered,
but where these descriptions are not enough to explain the behaviour that
is triggered. If we were not able to derive the triggered behaviour from these
physical or physiological events alone, then we would have to add a mental
non-physical component as a cause that completes the causal conditions
which brought about the observed behaviour. However, if we can explain the
events in the brain, the nervous system and the behaviour with just physical and
physiological explanations, then postulating an epiphenomenal subjective
sphere is an unnecessary metaphysical addition.


2 MACHINES 


Van Rooijen also offers a defence of the thesis that epiphenomenalism does not
imply that men are machines. What a machine in fact is remains unclear from Van
Rooijen's article. In his own words (Van Rooijen [1987] p. 90):


The physical similarity between me and a configuration of the matter that we call 
a machine is very superficial compared with all the similarities between me and 
other human beings (and the higher animals). Therefore there are not as many 
reasons to assume that machines have a psychological dimension, as there are 
reasons to make this assumption in relation with higher organisms. This seems 
to be a good reason to keep the distinction between men (and the higher animals)
and machines. 


But it does not seem so to everybody. I think that we can make several
objections to the paragraph cited. First, what is the criterion that makes some
configuration of matter a machine? That is, what is the criterion for
distinguishing configurations of matter which are
machines from those which are not? Van Rooijen does not offer
any criterion. Similarity between things of the same type, with the same
configuration of matter, cannot be such a criterion. It presupposes that we
already know a criterion for distinguishing configurations of matter which
are machines from those which are not; only then, for every other thing which
is similar enough to the thing to which we applied the criterion, can one
conclude that, because of that similarity, it is or is not a machine.


406 Davor Pećnjak


Let’s assume that a TV-set is a machine. But the configuration of the matter of
the TV-set is so different from the configuration of the matter of the
steam-engine of a steam ship, or in other words, the similarity between the
steam-engine and the TV-set is very small, if it exists at all. Then, according to Van
Rooijen’s theory, the steam-engine would not count as a machine.

I am not prepared to give here the necessary and sufficient conditions for
what is a machine, but we say that some device is a machine on the basis of
how it functions, how it operates, how it changes its states, how it
exchanges matter and data with its environment, what program it runs
and so on. The configuration of the matter of which the device is composed
rarely plays a role in determining whether it is a machine or not. If we have an
electronic device and a mechanical device (so that the configurations of
the matter of which they are composed are totally dissimilar) which both show
the time and do only that, shall we call them both clocks, and hence machines?
Yes, we shall call them both clocks, but on the grounds that they instantiate the
same programs, that their operational aims are defined in the same way, that
they instantiate the same functions and so on. The configuration of the matter
would be totally unimportant. It is not the case that there is only one configuration
of matter that can be called a machine, and that everything dissimilar to it is not a
machine. From the dissimilarity of the TV-set and the steam-engine we can
conclude that, let's say, the steam-engine is not a TV-set, but not that it is not a
machine. So, the steam-engine, the refrigerator, the TV-set and the Formula One V-8
engine can all be called machines despite the fact that they differ very much in
configuration of matter. If what I have said is true, then why should we use the
notion of configuration of matter in deciding whether or not humans are
machines? We do not have much reason to do so!

In fact, it seems that Van Rooijen a priori and tacitly assumes that he is not a
machine. But this is assumed without any convincing argument. It is then
extrapolated to every other human being who is similar to him in configuration
of matter, and hence no human being is a machine. This cannot
be a valid argument. The answer is given before the question is put.

Van Rooijen would be in an even worse position if successful functionalistic 
computational programs can be given for describing and explaining human 
psychology and behaviour. (But what follows can raise difficulties for Popper 
also.) In general, functionalistic computational programs do not depend on
how they are realized. The same program can be realized by various
different physical devices. This means that very different physical devices can
be built for realizing the same program, and that these physical devices could have
very different configurations of matter. They do not have to be similar at all.
Moreover, functionalistic programs do not even depend on physical realizability;
in fact, they do not depend on any particular realization. They can be
realized by non-physical things also.

If certain functionalistic programs were to be successful in describing
human psychology and mind, it would follow that such programs can be
realized by various other devices including physical ones, such as computers.
But we call computers machines. If the same program can be performed by
computers and human beings, why shouldn’t we say that human beings are
machines also? It was said above that programs can be realized by non-physical
things also. So even if the human mind is not physical, it can nevertheless be a
machine.

If we assume that the physical world is causally closed, then if we want the 
human mind not to be determined, it must be non-physical. But even if the 
human mind is not physical, it does not follow automatically that it is not 
determined. There can also be cases of determinism and determination in non- 
physical worlds and spheres. Why cannot possible worlds exist in which non- 
physical events, things and entities are also subordinated to lawful and causal 
relations, but, of course, not to physical lawful and causal relations? If 
something is not physical it does not follow immediately that in such kind of a 
world things and events are not determined. There exists a possibility of totally 
determined non-physical worlds if there exists a possibility of non-determined 
non-physical worlds. If the human mind is not physical, this is not enough to
establish the non-determination of mind. It must be shown that the human
mind is a non-physical entity which does not belong to a deterministic
non-physical world, or part of the world if the world is mixed, and that it is not, at
least not fully, determined by the physical. In fact, I have introduced the possibility
of mixed worlds which consist of physical and non-physical entities. Our world could
possibly be of this kind. But we must distinguish various kinds of non-physical
things and entities. If abstract entities like mathematical entities, e.g. numbers,
have independent existence, their existence is non-physical, but it is not of the
same kind as the non-physical existence of the human mind. In connection with
causal influence, there can be various causal bonds in mixed worlds:


(a) Physical can causally influence the non-physical but not vice versa. 
(b) Physical can causally influence the non-physical and vice versa. 
(c) Non-physical can causally influence the physical and not vice versa. 


We can also vary the degree of causal influence and determination. 
Interactionism fits well in worlds of kind (b). But if it is true that the degree of
causal determination can vary, then we can have a totally determined
interactionistic world. For example, the physical event $f_1$ causally and
inevitably leads to the mental event (which is non-physical) $m_1$; $m_1$ causally leads to
the physical event $f_2$; $f_2$ to $f_3$; $f_3$ to $f_4$; $f_4$ to $m_2$; $m_2$ to $m_3$; and so on. Every event can
causally follow from previous events no matter whether they are physical or
non-physical. (For the interactionist, I think, there remains the question of what to do
with conservation laws, but at least the logical possibility of such a world
exists.) If the interactionist wants non-deterministic interactionism, he or she


Brit. J. Phil. Sci. 40 (1989), 409-412 Printed in Great Britain 


DISCUSSION 
On the Origin of Spin in Relativity* 


There has been some recent discussion in this Journal on the question of the
origin of spin in physics, as to whether it is indeed a consequence of relativity
theory alone (Peschke [1988], Morrison [1986]). I should like to add a few
comments to this discussion, starting with the remark that this question was
indeed settled by Einstein and Mayer [1932]. What they did was to investigate
the algebra implicit in the theory of relativity, irrespective of any particular law
it may be applied to.

Einstein’s and Mayer’s investigation was to see what would be the most 
primitive (i.e. irreducible) form of the representations of the Poincaré group— 
that is, the algebraic symmetry group that underlies (the algebraic part of the 
logic of the) theory of special relativity. What they found was that as soon as 
one removes the space-time reflection transformations from the four-dimen- 
sional real representations of the Lorentz group (the group that prescribes how 
the components of a four-vector transform, from one inertial frame to any 
other, in accordance with special relativity) these representations reduce to the 
direct sum of two two-dimensional complex (and Hermitian) representations. 
This reduction then yields the irreducible representations of the symmetry 
group for the theory of special relativity per se, since this theory prescribes the 
covariance of the laws of nature only with respect to the continuous space- 
time transformations between inertial reference frames in which one com- 
pares the forms of these laws, in accordance with Einstein’s principle of 
relativity. 

The discovery of Einstein and Mayer was then that the basis functions of the 
irreducible, two-dimensional representations of the Poincaré group are the 
two-component spinor variables. Thus, it follows that the most primitive types 
of field variables with which to represent the laws of nature, if they are to be in 
accordance with the principle of special relativity, are the two-component 
spinor variables. The appearance of ‘spin’ in the laws of nature is then a 
consequence of the theory of relativity alone. 

*I thank the faculty of the Department of Physics, University of Leeds, for their kind hospitality
during a visit in 1988, when this note was prepared.

Of course, it is true that it was Pauli who originally introduced the spin
matrices into the nonrelativistic Schrödinger equation in wave mechanics.
However, this was no more than an ad hoc insertion, designed to reproduce
the empirically verified extra degrees of freedom, in yielding for example the
extra spectral lines of hydrogenic atoms, observed in the anomalous Zeeman
effect. Pauli's insertion did not explain the spin variables. The
explanation did not come until Dirac discovered that the Klein-Gordon
(second-order) differential equation, covariant under reflections in space and
time as well as the continuous space-time transformations of special
relativity, factorizes into a pair of coupled (first-order) differential equations, in
terms of the spinor variables, when one removes the reflection symmetry
elements from the underlying group of relativity theory.
Recalling this sequence of steps, the Schrödinger prescription,

$$E \rightarrow i\hbar\,\partial/\partial t, \qquad \mathbf{p} \rightarrow -i\hbar\nabla,$$

in wave mechanics, yields the Klein-Gordon equation from the energy-momentum
relation for a free particle in special relativity. Allowing this
so-constructed operator equation to operate on the matter wave $\psi$ we then
have

$$(E^2 - p^2c^2 - m^2c^4)\,\psi = 0 \tag{1}$$

$$(\Box + \lambda^2)\,\psi = 0 \tag{2}$$

where $\Box = (1/c^2)\,\partial^2/\partial t^2 - \nabla^2$ and $\lambda^2 = m^2c^2/\hbar^2$.
The factorization of the Klein-Gordon equation (2)

$$\sigma^\mu \partial_\mu \eta = -\lambda\chi \tag{3a}$$

$$\tilde{\sigma}^\mu \partial_\mu \chi = -\lambda\eta, \tag{3b}$$

in turn, yields the ‘reflected’ two-component spinor equations (3ab), in terms
of the two-component spinor variable $\eta$ and its reflected spinor $\chi = \epsilon\eta^*$, where
$\epsilon = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and the asterisk denotes complex conjugation. In eq. (3a), the basis
elements of the first-order differential operator are $\sigma^\mu = (\sigma^0; \sigma^k)$, where $\sigma^0$ is
the unit two-dimensional matrix and $\sigma^k$ are the three Pauli spin matrices
($k = 1, 2, 3$).

The operator $\sigma^\mu\partial_\mu$ behaves algebraically like a quaternion, and its
space-reflected operator, $\tilde{\sigma}^\mu\partial_\mu$, is its quaternion conjugate, where $\tilde{\sigma}^\mu = (\sigma^0; -\sigma^k)$. From
the commutation properties of the Pauli matrices and the unit matrix it then
follows that substitution of eq. (3b) into eq. (3a) yields the Klein-Gordon
equation (2):

$$(\tilde{\sigma}^\mu\partial_\mu)(\sigma^\nu\partial_\nu)\,\eta = \Box\,\eta = -\lambda^2\eta.$$

[The coupled spinor field equations (3a) and (3b) could, equivalently, be
re-expressed in terms of time-reflected (rather than space-reflected) quaternion
operators.]
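
The algebraic step used here, that the product of the quaternion operator with its conjugate is a multiple of the unit matrix, can be checked numerically. The sketch below is only an illustration: it replaces the derivatives by the components of an arbitrary real four-vector (a plane-wave substitution, which is an assumption made here and not part of the text), and the names `combo` and `conjugate_product` are hypothetical helpers. It checks only the operator identity, not the sign conventions attached to the mass term.

```python
# Pure-Python 2x2 complex matrices as nested tuples (no external libraries).
I2 = ((1, 0), (0, 1))          # unit matrix, sigma^0
SX = ((0, 1), (1, 0))          # Pauli matrix sigma^1
SY = ((0, -1j), (1j, 0))       # Pauli matrix sigma^2
SZ = ((1, 0), (0, -1))         # Pauli matrix sigma^3


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2))
        for i in range(2)
    )


def combo(k0, kx, ky, kz, sign):
    """Return k0*sigma^0 + sign*(kx*sigma^1 + ky*sigma^2 + kz*sigma^3)."""
    return tuple(
        tuple(
            k0 * I2[i][j] + sign * (kx * SX[i][j] + ky * SY[i][j] + kz * SZ[i][j])
            for j in range(2)
        )
        for i in range(2)
    )


def conjugate_product(k0, kx, ky, kz):
    """Product of the conjugate (space-reflected) element with the element itself."""
    q = combo(k0, kx, ky, kz, +1)        # stands in for sigma^mu d_mu
    q_conj = combo(k0, kx, ky, kz, -1)   # stands in for its space-reflected conjugate
    return matmul(q_conj, q)


# Arbitrary illustrative four-vector (hypothetical values).
k = (2.0, 0.5, -1.5, 3.0)
product = conjugate_product(*k)
norm = k[0] ** 2 - k[1] ** 2 - k[2] ** 2 - k[3] ** 2
print(product, norm)
```

Under these assumptions the off-diagonal entries of `product` vanish and both diagonal entries equal `norm`, the time-like square minus the space-like squares, which is the algebraic counterpart of the second-order wave operator.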




The relativistic covariance of the spinor equations (3ab) implies that
invariants of this formalism in special relativity are $\eta^\dagger\chi$ and its reflection,
$\chi^\dagger\eta$: scalar functions of the space-time coordinates that are neither even nor
odd with respect to reflections. However, they can always be re-expressed as
the sum of an even part and an odd part, $I_s$, that is even (scalar) and $I_{ps}$, that is
odd (pseudoscalar). That is,

$$\eta^\dagger\chi = \tfrac{1}{2}(\eta^\dagger\chi + \chi^\dagger\eta) + \tfrac{1}{2}(\eta^\dagger\chi - \chi^\dagger\eta) = I_s + I_{ps}.$$


What Dirac did, in effect, in his original formulation of relativistic wave 
mechanics, was to recover a formalism that is only even with respect to 
reflections (as with the Klein-Gordon equation), while maintaining the spin 
degrees of freedom. He did this by going from the two-component spinor 
equations (3) to the (more restrictive) four-component bispinor basis function, 


$$\psi = \begin{pmatrix} \eta + \chi \\ \eta - \chi \end{pmatrix}.$$

The relativistic invariant of the Dirac bispinor formalism, in turn, is
proportional to $I_s$ (above) while there is no counterpart for $I_{ps}$. This bispinor
formalism is what is usually referred to as the Dirac expression of wave mechanics, for
spin-one-half fields, such as the matter field variables for an electron or a
proton. Nevertheless, it is the pair of two-component spinor equations (3)
(usually identified with Majorana’s name) that is irreducible with respect to
special relativity alone. Its origin demonstrates that the most general expression of
the energy-momentum relation in wave mechanics for spin-one-half, massive
fields, compatible with the symmetry requirements of the theory of special
relativity, is necessarily in terms of a two-component spinor formalism.

However, the main point of this note is the idea that the expression of wave 
mechanics is only a special case of the more general result discovered by 
Einstein and Mayer—that the irreducible representations of the symmetry 
group of special relativity theory imply that the spinor variable is the most 
primitive way to express laws of nature that are compatible with Einstein's 
principle of special relativity. This result should then apply to all of the physical 
laws—from wave mechanics in the elementary particle domain to all of the 
other laws, up to the physics of the universe as a whole—that of cosmology. 

In conclusion, it should be noted that when one extends the symmetry
group from that of special relativity to that of general relativity, the
geometrical logic of theories of matter (in any domain) changes, but the
algebraic logic does not. Thus in general relativity one still has laws of matter,
most primitively, in terms of spinor and quaternion variables, though in this
case they are mapped in a curved space-time, governed by the rules of
Riemannian geometry, rather than the rules of Euclidean geometry, as in
special relativity theory. The latter global extension is discussed in detail in
Sachs [1986].

MENDEL SACHS 

Department of Physics and Astronomy 

State University of New York at Buffalo 


REFERENCES 


EINSTEIN, A. and MAYER, W. [1932]: Preuss. Akad. Wiss. Phys. Math. Klass. Sitz., p. 522.

MORRISON, M. [1986]: British Journal for the Philosophy of Science, 37, p. 101.

SACHS, M. [1986]: Quantum Mechanics from General Relativity, Chapter 4. Reidel
Publishing Co., Dordrecht.

VON PESCHKE, J. [1988]: British Journal for the Philosophy of Science, 38, p. 566.


Brit. J. Phil. Sci. 40 (1989), 413-417 Printed in Great Britain


DISCUSSION 


A Comment on 
Maxwell’s Resolution of the Wave/ 
Particle Dilemma 


The purpose of this note is to give a theoretical physicist's assessment of the
recent article on wavefunction reduction¹ by Nicholas Maxwell [1988]. Apart
from its somewhat unfamiliar terminology, most of Maxwell's work is in
agreement with the 'orthodoxy' accepted by the majority of physicists. He,
rightly in my opinion, is seeking a realistic interpretation of quantum theory.
He regards the ‘wavefunction’ or 'state-vector' as a real, existing entity and
follows the orthodox tradition (now more than 60 years old) by arguing that,
in addition to its deterministic evolution according to the Schrödinger
equation, the wavefunction must, under some circumstances, and in a
probabilistic way, undergo 'reduction'. (For a brief introduction to other,
non-orthodox, possibilities see Squires [1986].) Apart from the vague requirement
that such reduction is associated with ‘measurement’ or ‘observation’,
no convincing account of this reduction has ever been given. Experimentally
all that we know for certain is that in some circumstances it does not
happen.

Most of the previously suggested mechanisms for wavefunction reduction 
rely on the fact that measurements involve extremely complex systems 
containing large numbers of degrees of freedom, so, to some extent, the 
proposals take refuge in ignorance. Maxwell's idea has, at least apparently, the
great merit of being applicable to a simple system. Roughly speaking he claims
that wavefunction reduction occurs when there is the possibility of increasing 
the number of particles. 

¹ It is a sad fact that physicists and philosophers of science generally read mutually exclusive sets of
journals, even when considering the same topic, so I am grateful to Nicholas Maxwell for
sending me a copy of his article, and to the British Journal for the Philosophy of Science for allowing
me to intrude into its pages.

In order to try to make this more precise we restrict ourselves here to
non-relativistic quantum theory. Then the number of elementary particles is in fact
fixed. However, a bound state of two particles, say, can be broken into two
independent constituents. To keep things simple, we imagine a particle A
bound to a fixed (i.e. infinitely heavy) particle O; then any state of the system
can be expressed in the form


$$\Psi(A) = \sum_k a_k\,\psi_k(A) \tag{1}$$

where the $a_k$ are complex numbers and the $\psi_k$ are energy eigenstates

$$H\psi_k = E_k\psi_k, \tag{2}$$

$E_k$ being the eigenvalues.

The energies of the bound states are discrete: $E_0$ (the ground state energy)
$< E_1 < E_2$, etc. The non-bound states, however, lie in a continuum, starting
above the highest bound state energy. For such states the summation in (1)
becomes an integral but we adopt the convention of allowing the summation
sign to include an integral where necessary.

Now we suppose that in addition to (O and A) we have a particle B which
interacts with A. At some time, $t = 0$, B is in a wavepacket a large distance from
(OA), and the (OA) system is in its ground state. Such a situation is described by
the wavefunction

$$\Psi_{t=0}(A, B) = \psi_0(A)\,\chi_0(B, t=0). \tag{3}$$

We now let this system evolve according to the Schrödinger equation. If B is
moving towards (OA) then eventually the interaction between A and B begins
to be felt and the wavefunction becomes

$$\Psi_t(A, B) = \sum_k c_k(t)\,\psi_k(A)\,\chi_k(B, t). \tag{4}$$

Note that we can write $\Psi_t$ in this way, i.e. as an expansion in energy
eigenstates of the (OA) system, because such states are a 'complete set'. There
are many other ways of writing $\Psi_t$, just as a vector in 3-space can be written
in terms of different sets of base vectors, and the expansion in (4) is chosen
because it is appropriate to our purpose.

Now, whereas (3) contains 2 'particles' B and (OA), (4) in general contains 
terms with 3 'particles': A, B and O. Thus, according to Maxwell's suggestion, 
it signals wavefunction reduction. We must, however, immediately qualify 
such a remark because, in fact, a form like that in (3) only holds exactly at 
particular times (if at all). In general, for example because interactions are 
actually of infinite range, the true wavefunction will always contain contribu- 
tions with k corresponding to unbound states. The presence of such terms 
cannot therefore automatically signal immediate wavefunction reduction. 
(This would certainly violate experimental evidence.) We have several options, 
e.g. 


(a) when one, or more, of the $|c_k|$ corresponding to an unbound state becomes
bigger than some specified value (in the range [0, 1]) reduction
immediately occurs, or

(b) reduction occurs instantaneously, but with a probability which is an
increasing function of the $|c_k|$ for the unbound states, or

(c) reduction occurs not instantaneously but with a rate that is an increasing
function of the $|c_k|$ for the unbound states.


It is already clear that, even in the very simple situation we are considering
here, there is considerable 'vagueness'. This is not a criticism of the general
idea; rather it is intended to be an invitation for it to be made more precise. In
orthodox quantum theory the time dependence of the wavefunction is given
by Schrödinger's equation. Presumably the idea here is that there are
additional terms in this equation, which will have to contain a 'random' input,
and which are responsible for the required reduction. Suggestions along these
lines have in fact already been made, notably in the very elegant work of Pearle
[1983]. It would be interesting to see whether the general ideas of Maxwell
could be formulated in a similar type of equation.

Since in physics changes usually occur in a continuous fashion rather than 
suddenly, we shall follow the choice (c) above. We might also consider it 
reasonable to modify the original suggestion by treating all excited states 
($k \neq 0$) in the same way, i.e. not distinguishing between bound and unbound
states. (This is consistent with Maxwell's comment that it is the change in 'rest
mass’ that is relevant.) Then, if we recall that $\sum_k |c_k|^2 = 1$, we see that the
simplest measure of the excited states is the quantity $1 - |c_0|^2$. We could
therefore postulate the rate of wavefunction reduction to be equal to
$\lambda[1 - |c_0|^2]$ where $\lambda$ is a constant. In fact $\lambda$, which would have the dimensions
of (time)$^{-1}$, would normally be expected to depend on the process considered.
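
As a minimal sketch of this postulate, the fragment below computes the excited-state weight (one minus the squared modulus of the ground-state amplitude) and the corresponding reduction rate for a given list of amplitudes. The function names, the sample amplitudes and the sample value of the rate constant are all hypothetical, chosen only for illustration.

```python
def excited_weight(coeffs):
    """Return 1 - |c_0|^2 for a normalized list of complex amplitudes c_k.

    coeffs[0] is taken to be the ground-state amplitude (an assumption
    of this sketch about how the amplitudes are ordered).
    """
    total = sum(abs(c) ** 2 for c in coeffs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("amplitudes must satisfy sum |c_k|^2 = 1")
    return 1.0 - abs(coeffs[0]) ** 2


def reduction_rate(lam, coeffs):
    """Postulated reduction rate: the constant lam times the excited-state weight."""
    return lam * excited_weight(coeffs)


# Hypothetical two-level example: 64% ground state, 36% excited state.
amplitudes = [0.8, 0.6j]
print(excited_weight(amplitudes))
print(reduction_rate(2.0, amplitudes))  # lam = 2.0 in arbitrary inverse-time units
```

The rate vanishes when the system is entirely in its ground state and approaches the constant itself when the ground-state amplitude is negligible, which is the behaviour the postulate is meant to capture.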

What else can we say about the value of $\lambda$? A natural suggestion is that its
inverse should be similar to the time scale of the interaction, i.e. the time during which
B is in the neighbourhood of (OA). To justify this we need to recall the reason
why wavefunction reduction is introduced into quantum theory. Suppose the
(OA) system is used as a measuring device to determine whether B actually
passes in the vicinity of (OA), rather than in some different direction. The
measurement is completed by the time B has ceased to interact with (OA) and
we should therefore expect the ‘decision’, i.e. the reduction, to have
happened at essentially the same time.

Maxwell in fact suggests a different expression:

$$\Delta t \sim \frac{\hbar}{\Delta E}$$

for the time scale of the reduction, where $\Delta E$ is the ‘excitation energy’, e.g. the
binding energy of particle A. It is easy to see that this is the same order of
magnitude as the time taken for A to move a distance equal to the (OA) radius.
This is in contrast to the previous suggestion which was the time for B to move
a similar distance.
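
The order-of-magnitude claim can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that (OA) is a hydrogen atom in its ground state, so that the excitation energy is the hydrogen binding energy, the radius is the Bohr radius, and the speed of A is the fine-structure constant times the speed of light. These identifications, the variable names and the rounded constants are assumptions of the sketch, not part of the text.

```python
# Rounded physical constants in SI units (illustrative precision only).
HBAR = 1.054571817e-34         # reduced Planck constant, J s
EV = 1.602176634e-19           # one electron volt, J
C = 2.99792458e8               # speed of light, m/s
ALPHA = 1 / 137.035999         # fine-structure constant
BOHR_RADIUS = 5.29177211e-11   # m
BINDING_ENERGY_EV = 13.605693  # hydrogen ground-state binding energy, eV

# Time scale hbar / (excitation energy).
dt_maxwell = HBAR / (BINDING_ENERGY_EV * EV)

# Time for the bound particle to move one radius at its typical speed.
dt_crossing = BOHR_RADIUS / (ALPHA * C)

ratio = dt_maxwell / dt_crossing
print(dt_maxwell, dt_crossing, ratio)
```

With these inputs both times are of order $10^{-17}$ s and differ by a factor of about 2, consistent with the statement that they are of the same order of magnitude.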

We can obtain an upper limit to $\lambda$ if we note that too rapid reduction will
violate the established success of quantum theory which depends on the time
evolution being given by the Schrödinger equation. Perhaps the best test here
is in the context of atomic scattering processes, e.g. electron-hydrogen
scattering which is very similar to our simple B-(OA) interaction. All
calculations which go beyond the first order (‘Born’) approximation involve 
excited intermediate states, so the proposed collapse mechanism would 
seriously alter the results of such calculations and would affect their successful 
and detailed comparison with experiment (Bransden [1983]). We would 
expect the effects to be significant unless the time scale of reduction is much 
larger than the interaction time. This would seem to invalidate the first 
criterion given above, and similarly the second one would give incorrect 
results for slow electrons. 

If a specific model incorporating Maxwell's suggestion could be found, then 
clearly it would be of interest to calculate its effect on electron-hydrogen 
scattering amplitudes, and so put some experimental limits on the rate of
reduction. 

Maxwell suggests an additional dependence on the initial velocity of particle 
B (or the electron in the particular example), by asserting that the reduction 
only occurs when the energy is sufficient to give inelastic scattering, i.e. to
produce real as distinct from virtual excited states. This would eliminate the
effect for sufficiently slow incident particles and, in particular, would prevent it
occurring for bound state situations (e.g. the helium atom) where accurate
calculations are much easier. Again it would be interesting to attempt a proper
formulation of this condition which would presumably involve the Hamiltonian
operator rather than the energy, since the latter is undefined for general
quantum states.

Further problems arise if we try to formulate the above ideas in more general 
situations. We are seeking a model where reduction occurs to states in which 
particular subsystems are in eigenstates of the appropriate part of the 
Hamiltonian operator. However, we need to have some method of choosing 
which particular subsystems to consider or, more likely, of calculating the 
relative effectiveness of the reduction mechanism to competing subsystems. 
Presumably this is possible; until it is done we do not have a theory. 


Summary 


Maxwell's article endeavours to fill a gap in orthodox quantum theory by
proposing a criterion for wavefunction reduction. Several versions of this
criterion are given but all need to be formulated more precisely so that their
consistency, or otherwise, with present experimental results which agree with
standard calculations can be checked. Crude estimates, using one particular
model, certainly suggest the possibility of conflict.

Until a more precise formulation of the suggestion is given it remains an 
interesting but vague possibility—one of several (e.g. Bussey [1987]) in the 
‘fuzzy’ region between quantum theory and classical measurement—which 
cannot yet be regarded as a ‘resolution of the wave/particle dilemma’. 

Maxwell of course partially recognizes this problem, cf. for example his
remark ‘I am advocating a new research programme rather than a new
version of quantum theory.' However, it is unreasonable to claim that the
research programme is new: the search for convincing methods of reducing
the wavefunction is almost as old as quantum theory, and is fraught with
difficulties. The fact that no such methods exist at the present time suggests
caution in advocating a particular one until it has been properly formulated,
and may in fact be pointing to the conclusion that the whole idea is erroneous.


EUAN J. SQUIRES 

Department of Mathematical Sciences 
Science Laboratories 

University of Durham 


REFERENCES 


BRANSDEN, B. H. [1983]: Atomic Collision Theory, 2nd Edition. Benjamin.

BUSSEY, P. J. [1987]: ‘The Fate of Schrödinger's Cat’, Physics Letters, A120, p. 51.

MAXWELL, N. [1988]: ‘Quantum Propensiton Theory: A Testable Resolution of the
Wave/Particle Dilemma’, British Journal for the Philosophy of Science, 39, p. 1.

PEARLE, P. [1983]: ‘Experimental tests of dynamical state vector reduction’, Physical
Review, D29, p. 235.

SQUIRES, E. J. [1986]: The Mystery of the Quantum World. Adam Hilger.


Brit. J. Phil. Sci. 40 (1989), 419-427 Printed in Great Britain 


REVIEW 


SIEGEL, H. [1987]: 
Relativism Refuted: A Critique of 
Contemporary Epistemological Relativism. 
Synthese Library/Volume 189. D. Reidel Publishing Company. ix+210 pp. 


ROBERT NOLA 
University of Auckland 


In the Theaetetus (at 171d) Socrates makes the sly suggestion that relativists 
never stay around long enough to debate the cogency of their doctrines. After 
having argued that Protagoras' relativism is self-refuting, Socrates imagines 
that the deceased Protagoras might reappear, his head popping out of the 
ground accusing Socrates and his companions of talking nonsense—and then 
immediately disappear back into the ground again. Subterranean relativists 
are still with us, burrowing away at ever new varieties of relativism. Their 
heads spring up in even greater profusion to proclaim relativism yet again. But, 
unlike Protagoras, the new, hardier varieties of relativist seem to take longer to 
disappear even though they are equally unperturbed by the (largely correct) 
criticism that has been directed at them. Philosophy of science has, 
surprisingly, for the last thirty years or so provided a fertile breeding ground for 
relativists—especially the off-shoot which combines historical and social 
studies of science with an appropriate philosophy of science. The older 
Protagorean variety of relativist who relativized perceptions and/or truth may 
still be found. Burgeoning alongside are those who would relativize not just 
truth but also the ontology of science, the knowledge claims of science (from 
observation reports to theoretical claims) and the canons of scientific method 
(i.e., the rules whereby scientific claims are assessed). Cross-fertilization has
produced a number of sub-varieties of relativism. Each of truth, ontology,
knowledge and method has been variously relativized to time, or place, or
society, or class, or conceptual framework (theory, paradigm), or whatever. 
Though not one to underestimate the wily relativist, Socrates might be 
surprised at the large number of relativist heads dotting our contemporary 
intellectual landscape. 

Just as no single kind of weed-killer will eliminate all weeds in one's garden, 
so no single argument will deal with all varieties of relativism one might 
encounter (and will not deal with those relativists who, when treated to a 
seemingly fatal dose of argument, retreat underground only to pop up their 
heads again elsewhere). Siegel offers us some arguments which deal (despite
the book's title) with only some relativist doctrines. As he acknowledges
(p. 167), he omits discussion of forms of Wittgensteinian relativism, Win- 
chian relativism in social science, the Edinburgh strong programme and 
varieties of incommensurability, amongst others. The book is largely a 
negative critique of selected aspects of relativism. In Part I, Siegel avails himself 
of reformulated arguments from Plato’s Theaetetus with which to attack 
mainly epistemological relativism and, in particular, a version of truth- 
relativism due to Meiland. Part II tackles the book's main target: this is the 
relativism which, he argues, arises out of Thomas Kuhn's highly influential 
The Structure of Scientific Revolutions, various aspects of which have found 
support in the work of philosophers such as Meiland, Doppelt and Brown. The 
final part of the book begins with a critique of Goodman's version of relativism, 
but ends on a positive note in favour of an absolutist alternative to relativism 
(though only a sketch of this is given). 

All but two of the eight chapters have appeared elsewhere and reappear in 
the book with varying amounts of revision and additional material to give the 
book continuity. The book is clearly written (but occasionally repetitious) and 
its arguments are carefully and clearly set out. Several of Siegel's arguments 
against the relativisms he depicts are familiar, but they are presented in a way 
which is quite telling. The book has little in the way of constructive 
alternatives to the views which are criticized, which may disappoint those who 
are already convinced of the inadequacy of relativism. But it is a welcome 
addition to the arsenal of material to be deployed against relativism in some of 
its many guises. Relativists should read this book; unfortunately they may not, 
or may fail to alter their ways once they have. It may be too much to expect 
that, like the Protagoras of Socrates' striking image, they will disappear 
underground again. That they do not would be a significant sociological fact 
which would stand in need of explanation, part of which may be provided by a 
(non-relativist) sociology and psychology of belief (thus turning the tables on 
relativist sociologists of ‘knowledge’). 

The first half of Plato’s Theaetetus contains a wealth of arguments against 
Protagorean relativism, two of which Siegel singles out for special consider- 
ation. Siegel first formulates what he calls ‘epistemological relativism’ (ER) 
which concerns the alleged relativization of all knowledge claims to rival sets of 
standards for evaluation (rather than Protagoras’ relativizations to individual 
persons): 


ER: For any knowledge-claim p, p can be evaluated only according to one or 
another set of background principles and standards of evaluation 
s1, ..., sn [=S]; and, given a different set (or sets) of background principles 
and standards s1', ..., sn' [=S'], there is no neutral (that is, neutral with 
respect to the two (or more) alternative sets of principles and standards) way 
of choosing between the two (or more) alternative sets in evaluating p with 
respect to truth or rational justification. p's truth and rational justifiability 
are relative to standards used in evaluating p. [p. 6, slightly abbreviated] 


1 Page references in parentheses are to the volume under review. 


Siegel argues for the incoherence of ER on two grounds which can be briefly 
summarized as follows. Consider what he calls 'the necessarily-some-beliefs- 
are-false (NSBF) argument’. If we grant that ER is true then it follows that p is 
true (justifiable) relative to S while p is not true (not justifiable) relative to S', 
and that there is no neutral way to adjudicate between these two claims to 
discover whether or not p is true (justifiable). This is something relativists 
allege about rival sets of rules of scientific method. Granted this, what follows? 
As an instance of p choose ER itself. It follows that if ER is true then ER is true 
(justifiable) relative to S but ER is not true (not justifiable) relative to S’. In 
particular, it follows that if ER is true then ER is false (not justifiable) relative to 
S’. That is, the relativist must admit that if ER is true, then, relative to standards 
of truth and justification S’, ER is false. Thus the NSBF argument—some of our 
beliefs are false, in particular ER itself, if ER is true. 

This does not dispose of ER immediately because, as Siegel recognizes, the 
relativist does have some escape routes. Relativists might exempt ER from 
applying to itself (Siegel blocks this route). Or they might want to relativize 
truth fully and interpret it in some particular way. Thus, the antecedent of the 
conditional in the third-to-last sentence of the previous paragraph claims that ER 
is true, in an absolute sense of ‘true’. The truth predicate could be relativized to 
some suitable relativizer, in which case ‘true’ is a three-place relation rather 
than the normal two-place relation (of the classical correspondence theory of 
truth). We will return to this shortly. 

Siegel’s other argument against relativism culled from the Theaetetus 
depends on the fact that relativism undermines the very notion of rightness 
(hence, ‘the UVNR argument’). In summary, the argument is as follows. Either 
the relativist defends ER relativistically or non-relativistically. If the latter, then 
there are good reasons for holding ER, good reasons being those which are not 
arbitrary, or idiosyncratic or non-neutral, and, in particular, are not relativist 
in character. So if there are good reasons for ER’s rightness then there are 
neutral non-relative standards on the basis of which such judgements are 
made. But ER itself says that there are no such neutral non-relative standards. 
So if ER is right then ER is wrong, i.e., ER undermines the very notion of 
rightness. If the relativist defends ER relativistically, then the claim that 
relativism is right must be abandoned in virtue of the very claim that ER itself 
makes about the standards of assessment. A relativistic defence is no defence at 
all because the very idea that there is a well-grounded defence for ER is what 
ER gives up. 




To escape these arguments a relativist might resort to a fully relativized 
notion of truth. But what is this? Through an examination of Meiland’s 
account of ‘φ is true for W’ (where ‘φ’ is a statement and ‘W’ is a person, or a set 
of leading principles, or world view, or cultural and/or historical situation, or 
any other appropriate relativizer), Siegel shows that either the notion is 
incoherent, or it collapses back into a non-relativist (i.e., absolutist) account of 
truth, or it means no more than W believes that φ (in the case where W is a 
person). The arguments are too detailed to repeat here; but they are 
scrupulously fair to Meiland’s position and do show that truth construed as a 
three-place relation between a statement, the world and various candidates for 
W, faces serious difficulties. Thus the relativist has no coherent notion of 
relative truth to retreat to in the face of the NSBF and the UVNR arguments. 

However, Siegel is not yet finished with possible responses by the relativist. 
He envisages that he is wrong and supposes that there is some viable notion of 
relative truth available to the relativist. He then shows that it is a non-sequitur 
to argue from the claim that truth is relative to the claim that the relativized 
truth is itself worthy of belief, or has any epistemic warrant at all bestowed 
upon it. Suppose that p is true relative to some person W. All that this claims is 
that p is believed by W or that p corresponds to W's conception of reality. 
Nothing follows about p's being more worthy of belief by W than any other 
statement, whether it be not-p or any arbitrarily chosen q. If it is the case that p 
is true for W then it does not follow that W ought to believe that p, or that p is 
justified for W—in fact, the relative truth of p confers no epistemic warrant 
upon p at all. This Siegel stigmatizes as 'the impotence of relativism'. This point 
has not been emphasized greatly in the literature on relativism and is worth 
highlighting. It is not to be evaded by the relativist who claims that p's being 
true for W makes it more worthy of belief by W than some rival q which, while 
relatively true, is true relative to some other person Y. It is incumbent on the 
relativist to show why p is more worthy than q; but the relativist cannot do this 
because, as ER says, the very notion of worthiness has been abandoned. The 
impotence of truth-relativism to assign epistemic warrant to any proposition 
goes hand in hand with the arbitrariness of the beliefs entertained by the 
relativist. 

Some critics have found Plato's Theaetetus wanting, especially the self- 
refutation argument at 170e-171c. Siegel's considerations on ER provide a 
useful way of understanding what is going on and a supplement to other 
considerations of that passage, especially by Burnyeat.² The upshot is that 
Socrates' arguments against relativism remain as compelling as ever. 

2 See M. F. Burnyeat [1976]: ‘Protagoras and Self-Refutation in Plato's Theaetetus', The 
Philosophical Review, 85, pp. 172-95. 

What are the alternatives to relativism? In a brief final chapter Siegel 
considers some candidates. Relativists often fallaciously infer from the 
rejection of relativism that one must accept some ‘eternally correct’ or 
incorrigible belief system. While these two positions are exclusive of one 
another, they are not exhaustive of the possibilities; non-relativism admits a 
wide range of positions including varieties of absolutism, objectivism, 
fallibilism and pluralism, to mention just a few. Fallibilists require that truth 
claims be evaluated in a non-arbitrary fashion in that they be based on reasons 
which may be fallible but which can provisionally justify the acceptance of the 
truth claims. Fallibility does not entail relativism of the ER variety. Nor does 
pluralism; pluralists do not hold with relativists that there is no ranking of 
truth claims, but they do admit that there may be a diversity of criteria of 
evaluation. Relativists are often impressed by the diversity of approaches 
advocated by pluralists but they mischaracterize that diversity by describing it 
as relativist. 

Part II of Siegel's book is devoted to relativism in the philosophy of science in 
so far as it arises from Kuhn's influential The Structure of Scientific Revolutions 
(SSR). Is Kuhn really a relativist? If so, of what sort? Siegel takes us through 
well-explored Kuhnian territory to display the fact that Kuhnian paradigm 
shifts involve not only changes in our theories but also changes in our 
standards and criteria for evaluating theories. Along with these latter kinds of 
change Siegel argues that Kuhn is committed to the view that there is no 
neutral way of evaluating pairs of standards (or criteria of evaluation) and thus 
that Kuhn is committed to a version of epistemological relativism. In fact, one 
of the many ways the term ‘incommensurable’ is used is to claim that there is 
no neutral way of, or common standard for, comparing theories. Let us grant 
the claim that paradigms do carry with them their own internal standards or 
criteria for evaluation of theories within the paradigm. As Scheffler first 
argued, and as Siegel develops the theme, nothing follows about there being no 
external standards or criteria for evaluating paradigms. So what are the 
grounds for claiming that there are no grounds for adjudicating between rival 
paradigms? Siegel takes us through Kuhn’s post-SSR writings, including the 
‘Postscript-1969’ to the second edition of that work, up to the paper 
‘Objectivity, Value Judgement and Theory Choice’ (a lecture delivered in 1973 
and first published in 1977 in The Essential Tension). For those who have 
studied Kuhn both the ground covered and the points made will be fairly 
familiar. They reveal that the original radical claims of SSR are considerably 
modified, but, alas, many of the old appeals, e.g., to incommensurability, are 
made in support of the modified position. This is both confusing and confused, 
as Siegel shows, because the old arguments bolster what seem to be new, less 
radical, positions. 

Is there nothing in Kuhn's position worth redeeming? The impression that 
there is nothing is reinforced by the overall negative view of relativism in the 
book. While this may be appropriate there is the niggling feeling that there 
may be something extractable from Kuhn's position given a more positive non- 




verisimilitude. Siegel, along with others, takes Laudan to task for this claim. 
Whatever the case, let us grant Laudan’s view of problem-solving and see 
whether or not it leads to relativism. Rival research traditions, RT1 and RT2, 
would be (weakly) commensurable if there is at least one problem P that is 
recognized as the same problem for both RT1 and RT2. Both could then be 
compared as to how well, if at all, they solve P. However, a degree of 
incommensurability would set in if proponents of the rivals RT1 and RT2 
disagreed over the importance of the one problem P that they had in common. 
(Siegel and others do cite examples from science in which proponents of rival 
traditions differ over the degree of significance some problem has for their 
respective research traditions—we can assume such historical cases here.) 
Since the degree of importance any problem has is determined by features of a 
given research tradition then we may have a situation in which, say, the same 
problem P is given no significance by RT1 (and so whether or how well RT1 
solves P is irrelevant) but is rated of very high significance by RT2 (and a great 
deal hinges on whether RT2 can come up with a satisfactory solution, 
satisfactory, that is, by the lights of RT2). Siegel discusses such a possibility (see 
pp. 132-6) and I think Laudan would agree that such a possibility could arise. 
But does this now show that Laudan is committed to some version of 
epistemological relativism? 

I think not. All that has been shown so far is that it is possible that there exists 
a pair of rival research traditions for which there is one common problem 
which is given entirely disparate weightings of significance. This at best would 
commit Laudan to a 'modest relativism' for this pair of rival research 
traditions. What would have to be shown is that, necessarily, for any pair of 
rival research traditions there exists at least one common problem which has 
entirely disparate weightings of significance; this would be sufficient (but not 
necessary) to show that there was radical epistemological relativism as far as 
rival research traditions are concerned. The difference lies in the modality of 
the two claims and the quantifiers; science may well illustrate the first, weaker, 
claim but much more is needed to establish the second, much stronger, claim 
which would then lead to relativism. Siegel makes a similar point against 
Doppelt who espouses what he calls a ‘moderate relativism’—this being the 
view (in a nutshell) that scientific change is often or typically underdetermined 
by good reasons. Note ‘often or typically’, and not ‘always’, ‘must be’, or ‘in 
principle’. This, Siegel argues, is not relativism at all since it lacks all the bite of 
a fully-fledged epistemological relativism. Doppelt wants to have the label 
‘relativist’ for himself, but by Siegel’s argument (see pp. 90-2) he can not have 
it. By the same token Laudan, who does not want the label in Progress and its 
Problems (or elsewhere), has been given it by Siegel. But if Doppelt can not have 
it then Laudan can not have it either. 
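
The difference in modality and quantifiers between the two claims can be set out schematically. What follows is only an illustrative sketch, not a formalization found in the book under review or in Laudan; the predicate names Rival, Common and Disp (read 'P is given entirely disparate weightings of significance by the two traditions') are labels assumed here for convenience.

```latex
% Weaker claim: possibly, some pair of rival traditions shares a problem
% to which they give entirely disparate weightings of significance.
\Diamond\, \exists RT_1\, \exists RT_2\,
  \bigl[\, \mathrm{Rival}(RT_1, RT_2) \wedge
     \exists P\, \bigl( \mathrm{Common}(P, RT_1, RT_2) \wedge
                        \mathrm{Disp}(P, RT_1, RT_2) \bigr) \bigr]

% Stronger claim: necessarily, every pair of rival traditions shares
% such a problem.
\Box\, \forall RT_1\, \forall RT_2\,
  \bigl[\, \mathrm{Rival}(RT_1, RT_2) \rightarrow
     \exists P\, \bigl( \mathrm{Common}(P, RT_1, RT_2) \wedge
                        \mathrm{Disp}(P, RT_1, RT_2) \bigr) \bigr]
```

On this rendering the first formula neither entails nor approximates the second: a single possible pair of traditions satisfying the first leaves open that most, or even all actual, pairs of rivals share problems on whose significance they agree.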

Part III contains two chapters, the first on Goodman’s relativism and the 
second, already mentioned, on alternatives to relativism. According to the 
later Goodman there are many different equally true descriptions of the world, 
there are many true or right versions. Does Siegel give us a true description or a 
right version of Goodman and his works? Or is he merely Goodmanmaking 
(perhaps in a way not too different from the way Goodman goes ‘Starmaking’)? 
In Siegel's version there are two doctrines of Goodmanian relativism (GR1 and 
GR2) each of which fits something Goodman says but only one of which is a 
version of epistemological relativism. GR1 is the weaker claim of the two and 
says that even though there may be several right versions it does not follow 
that all versions are right; there are wrong versions and there are 
version-neutral criteria which discriminate right from wrong versions. GR1 does not 
lead to epistemological relativism and reflects what might be more correctly 
described as Goodman's pluralism rather than his relativism. However, 
according to Siegel, Goodman claims that GR1 itself is one of several versions, 
i.e., in Ways of Worldmaking Goodman presents a meta-version, a version of 
versions, which is itself rivalled by other meta-versions. Within Goodman's 
own meta-version 'not everything goes'; but what 'does not go' for Goodman's 
meta-version may well ‘go’ for some other meta-version. This leads to GR2: 
the claim that what counts as a right or a wrong version is relative to a 
meta-version, and since there is a range of meta-versions which adjudicate rightness 
and wrongness differently there are no neutral version-independent criteria of 
rightness or wrongness. It is GR2, the presence of which Siegel documents in 
the writings of Goodman, which leads to epistemological relativism. 

The pluralism of GR1 and the radical relativism of GR2 are independent of 
one another. GR1 is the weaker view that there can be incompatible right 
versions, not the view that all versions are right. GR2 entails that all versions 
are right and is therefore powerless, as Siegel puts it, to discriminate amongst 
versions as to their being right or wrong. Relative to one meta-version a 
particular version can be right, but relative to another meta-version the very 
same version can be wrong. Hence, GR2 is impotent to discriminate amongst 
versions and anything goes. Has the real Goodman made the error of 
conflating GR1 with GR2 as in the version of Goodman Siegel gives us? Or does 
the real Goodman fully embrace GR2? In separating Goodman's pluralism from 
his radical relativism Siegel has articulated an objection that strikes a response 
in some versions of Goodman which some readers may have made for 
themselves—though it does not answer all the problems they might find in 
their version. 

It remains to mention chapter 2 on framework relativism. Here Siegel 
discusses Popper's critique of relativism and Davidson's attack on the idea of a 
conceptual scheme. Popper’s heart is in the right place concerning relativism 
but the arguments, such as can be discovered in his paper ‘The Myth of the 
Framework’, are found wanting by Siegel. This paper is Popper at his most 
preachy and Siegel correctly sets it aside since it fails to provide a good 
argument with which to attack the relativist. Of more significance is 
Davidson’s ‘On the Very Idea of a Conceptual Scheme’. Briefly, Davidson 
presents conceptual scheme relativists with a dilemma: either one conceptual 
scheme is translatable into another or, if untranslatable, we lack any evidence 
for anyone possessing a radically different conceptual scheme—and thus the 
very idea of a conceptual scheme is problematic. While not disagreeing with 
Davidson’s position, Siegel claims that it does not capture all the possibilities of 
relativism. In particular, there may be alternatives which are inter-transla- 
table but are incomparable in the sense that there seems to be no way of 
rationally assessing them with respect to one another, or there seem to be no 
standards available for declaring one alternative better than another. Siegel's 
appeal is not to varieties of relativism that can be discerned within the theory of 
meaning, but to varieties of relativism within epistemology; it is these latter 
varieties that have reared their heads more commonly in the philosophy of 
science. But, just as Davidson’s broader conceptual scheme relativism 
understood as an issue concerning inter-translatability succumbs to incoher- 
ence, so framework relativism understood as an issue of rational comparability 
(of rival but inter-translatable ‘frameworks’) succumbs equally, as Siegel 
shows, to the criticisms of epistemological relativism (ER) already canvassed. 


Brit. J. Phil. Sci. 40 (1989), 428 Printed in Great Britain 


ERRATUM 
Volume 40 * Number 1 * March 1989 


Article: Distant Action in Classical Electromagnetic Theory, 
by Brent Mundy 


Page 41—paragraph 1—should read: 


Section 2 outlines a retarded distant action formulation of modern classical 
electromagnetic theory. The central point is that the mathematical apparatus 
developed within the field theory itself allows for a translation to a distant 
action form, while retaining the same equations of motion for the material 
particles and hence leading to an empirically equivalent theory. The possibility 
of such a translation has been noted by several philosophers but not discussed 
in detail.³ Here I summarize the steps involved in the translation and comment 
on certain formal aspects of the resulting distant action theory. 

3 Hesse ([1955], p. 351) says, ‘The point at issue between the Continental school and that of 
Maxwell is partly a question of mathematical convenience, since formulation in terms of either 
action at a distance or field theory can be made to yield results that are confirmed by 
observations. . ..' Nagel [1961], pp. 395-6 makes a similar remark. The translation thesis is 
mentioned but rejected in Stein [1970], p. 283; Stein's arguments will be discussed below. 


Volume 40 * Number 2 * June 1989 


Review Article: Mr Keynes on Probability, by F. P. Ramsey 


Pages 220/222—Running head 
Author's name should read F. P. Ramsey not D. H. Mellor 


THE BRITISH SOCIETY FOR THE 
PHILOSOPHY OF SCIENCE 


Programme of Meetings 1989-90 


Unless otherwise announced, meetings are held in Room 160 (Vera Anstey Room) 
at The London School of Economics and Political Science (University of London), 
Houghton Street, London WC2A 2AE at 5.15pm on the following dates. Tea and biscuits 
will be served at 5.00pm. 


1989 


October 2nd P. M. Williams 
Uncertainty in artificial intelligence 


November 13th Tony Stone and Martin Davies 
Cognitive neuropsychology and the philosophy 
of mind 


December 11th Peter Gibbins 
Quantum logic and its uses 


1990 


January 8th Peter Urbach 
Statistical inference 


February 5th Adam Morton 
Mathematical modelling: 
Why models aren't theories 


March 5th Tony Dale 
The problem of logical consequence 


April 23rd Ian Thompson 
How to imagine quantum reality 


June 11th 4.00pm Annual General Meeting 
4.30pm Tea and Biscuits 


Edgar Page 
Opposing perspectives on reproductive medicine 
and technology 


These meetings are open, without charge, to members and to the general public. Details of membership of 
the Society may be obtained from the Honorary Secretary Mr G. Ross, Department of Physics, King's 
College, Strand, London WC2R 2LS. 


The Department of Philosophy, Logic and 
Scientific Method at the London School of 
Economics announces the establishment of two 
new graduate degrees: a one year taught M.Sc. 
in the Philosophy of the Social Sciences, and an 
‘American-style’ Ph.D. programme, with a one 
year course requirement. The M.Sc. in Logic and 
Scientific Method continues unchanged. The 
LSE invites enquiries from all interested. 


Please contact: 

The Graduate School at LSE, 
Houghton Street, 

London WC2A 2AE. 





Cambridge 


Now in paperback 

Niels Bohr’s Philosophy of 
Physics 

D. R. MURDOCH 

This clear exposition of Bohr's 
philosophy of physics gives a detailed 
analysis of his arguments for 
complementarity and of the 
interpretation he put on it. The great 
debate with Einstein is also thoroughly 
examined. 

‘a worthy addition to Bohrian 
scholarship.' New Scientist 


£12.95 net Pb 0 521 37927 X 
294 pp. 1989 


The Uses of Experiment 


Studies in the Natural Sciences 

Edited by D. GOODING, T. PINCH and 
S. SCHAFFER 

Scholars from a range of disciplines 
present original case studies to examine 
the tools that experimenters use, how 
scientists judge that experiments are 
working and argue from their results, 
and how results can be challenged. 
£45.00 net Hc 0 521 33185 4 

£17.50 net Pb 0 521 33768 2 

498 pp. 1989 


Substance, Form and 
Psyche 


An Aristotelian Metaphysics 
MONTGOMERY FURTH 

This book presents a complete 
rethinking of Aristotle’s metaphysical 
theory of material substances, through 
treatment of the conceptions of 
substance and non-substance in 
Categories, de Anima and Metaphysics. 
The main aim of the study is to recreate 
in modern imagination a vivid, intuitive 
understanding of the Aristotelian 
concept of material substances. 

£30.00 net 0 521 34143 4 

320pp. 1988 


Reality and the Physicist 
Knowledge, Duration and the Quantum 
World 

BERNARD D'ESPAGNAT 

This is an investigation of the nature of 
reality from the viewpoint of a physicist. 
It raises profound questions about the 
relationship between the methods of 
science and the reality these methods 
seek to investigate. 

£30.00 net Hc 0 521 32940 X 

£10.95 net Pb 0 521 33846 8 

280 pp. 1989 


Cosmic Problems 


Essays on Greek and Roman Philosophy 
of Nature 

DAVID FURLEY 

Representing the scholarly infrastructure 
of Professor Furley's The Greek 
Cosmologists, these essays tackle the 
questions current in ancient cosmology 
and the clash between the two opposing 
systems known as Aristotelianism and 
Atomism. 

£27.50 net 0 521 33330 X 

272pp. 1989 


The Empire of Chance 
How Probability Changed Science and 
Everyday Life 

GERD GIGERENZER et al. 

This book tells how quantitative ideas of 
chance have transformed the natural 
and social sciences, as well as daily life, 
over the past three centuries. 

£30.00 net 0 521 33115 3 

357pp. 1989 

Ideas in Context 

Published with the support of Exxon 
Education Foundation 


To request a copy of our new History and 
Philosophy of Science leaflet please write to 
Jacqueline Arthurs at the address below. 


Cambridge 
University Press 
The Edinburgh Building, Cambridge CB2 2RU 





INQUIRY 


An Interdisciplinary Journal of Philosophy 
EDITOR: ALASTAIR HANNAY 





Contents of Vol. 32, No. 1, 1989: 


SYMPOSIUM ON Ted Honderich’s A Theory of Determinism 

Galen Strawson: Consciousness, Free Will, and the Unimportance of Determinism. 
Jaegwon Kim: Honderich on Mental Events and Psychoneural Laws. 

Barbara Hannan and Keith Lehrer: Compatibilism, Determinism, and the Identity Theory. 
Richard Schacht: Whither Determinism? On Humean Beings, Human Beings, and 
Originators. 

Christopher Hookway: The Epicurean Argument: Determinism and Scepticism. 


Jennifer Hornsby: Reasoned Choice (Alan Donagan: Choice: The Essential Element in 
Human Action, and Michael E. Bratman: Intention, Plans, and Practical Reason) 
Geoffrey Madell: Physicalism and the Content of Thought (Lynne Rudder Baker: Saving 
Belief: A Critique of Physicalism) 
Books Received 
INQUIRY is published quarterly by 
UNIVERSITETSFORLAGET 
(Norwegian University Press). 


P.O. Box 2959 Tøyen, 0608 Oslo 6, Norway, or U.S. Office: Publications Expediting Inc., 
200 Meacham Ave., Elmont, NY 11003, USA. 


Subscription to INQUIRY (4 issues per year) 
Rates 1989 (postage included; airmailed to subscribers in the Americas): 

Nordic countries only: 
Institutions NOK 480,- 
Individuals NOK 250,- 
All other countries: 
Institutions USD 87,- 
Individuals USD 45,- 


University of London 


CHAIR OF PHILOSOPHY 
OF NATURAL SCIENCE 
TENABLE AT 

KING'S COLLEGE LONDON 


The Senate invite applications for the above Chair, tenable in the 
research-oriented Department of the History and Philosophy of Science at 
King’s College London. 

The successful applicant will have produced outstanding work in the 
philosophy of modern natural science or mathematics, and will be expected 
to provide academic leadership. 

Applications (10 copies) should be submitted to the Teachers’ Section (POS), 
University of London, Malet Street, London WC1E 7HU, from whom further 
particulars should first be obtained. 

The closing date for receipt of applications is 6 November 1989. 


Teaching Philosophy 


Call For Papers 
Teaching Modern Philosophy 


$250 Prize For The Best Article 

Teaching Philosophy is devoting a special issue to Modern Philosophy 
and is soliciting papers on the following topics: 
• The relevance of Modern Philosophy today, or of specific authors 
(Descartes, Leibniz, Hume, et al.). 
• Teaching the Moderns in new guises and other courses. 
• Novel ways to teach the concepts of Modern Philosophy. 
• Novel assignments and course designs. 
• Special problems in teaching Modern Philosophy, etc. 
The deadline for submissions is September 1, 1989. Please send 3 copies 
of your manuscript to Teaching Philosophy, Arnold Wilson, Editor, 
University of Cincinnati, Cincinnati, Ohio 45221-0206, USA. 


























Synthese 


An International Journal for Epistemology, 
Methodology and Philosophy of Science 


Editor-in-Chief


Jaakko Hintikka, Dept. of Philosophy, Florida 
State University, Tallahassee, USA 


Synthese publishes articles in all the fields covered
by the subtitle. These include the theory of knowledge;
the general methodological problems of
science, such as the problems of scientific discovery
and scientific inference, of induction and
probability, of causation and of the role of mathematics,
statistics and logic in science; the methodological
and foundational problems of the different
departmental sciences, in so far as they have
philosophical interest; those aspects of symbolic
logic and of the foundations of mathematics which
are relevant to the philosophy and methodology of
science; and those facets of the history and
sociology of science which are important for
contemporary topical pursuits. Special attention is
paid to the role of mathematical, logical and
linguistic methods in the general methodology of
science and in the foundations of the different
sciences, be they physical, biological, behavioural
or social.


Subscription Information ISSN 0039-7857 
1989, Volumes 78—81 (12 issues) 

Institutional rate: Dfl. 952.00/US$466.00 incl. p&h
Private rate: Dfl. 368.00/US$176.00 incl. p&h


Private subscriptions should be sent direct to the publishers.
Back volume information is available upon request.


PO Box 322, 3300 AH Dordrecht, The Netherlands 


P.O. Box 358, Accord Station, Hingham, MA 02018-0358, U.S.A.


KLUWER
ACADEMIC
PUBLISHERS



Philosophical Studies 


An International Journal for Philosophy in the
Analytic Tradition


Editor-in-Chief
Keith Lehrer, University of Arizona


Associate Editor
John Pollock, University of Arizona


Philosophical Studies was founded in 1950 by
Herbert Feigl and Wilfrid Sellars to provide a
periodical dedicated to work in analytic philosophy.
The journal is devoted to the quick publication of
analytical contributions, particularly (but not exclusively)
in epistemology, philosophical logic, the
philosophy of language, and ethics. Papers applying
formal techniques to philosophical problems
are particularly welcome. The principal aim is to
publish articles that are models of clarity and
precision in dealing with some significant philosophical
issues. Articles in the journal are intelligible
to philosophers whose expertise lies outside
the subject matter of the article. It is intended that
a diligent reader of the journal will be kept informed
of the major problems and contributions of contemporary
analytic philosophy.


Subscription Information ISSN 0031-8116 
1989, Volumes 55-57 (9 issues)

Institutional rate: Dfl. 582.00/US$285.00 incl. p&h
Private rate: Dfl. 234.00/US$103.50 incl. p&h
Private subscriptions should be sent direct to the publishers.


Volumes 18-22, 31, 34 may be ordered from:
Swets & Zeitlinger BV, P.O. Box 810, 2160 SZ LISSE, The Netherlands


Back volume information is available upon request.


P.O. Box 322, 3300 AH Dordrecht, The Netherlands 


P.O. Box 358, Accord Station, Hingham, MA 02018-0358, U.S.A. 


KLUWER 
ACADEMIC 
PUBLISHERS 










Journal of Philosophical 
Logic 
Editors 


J. Michael Dunn, Dept. of Philosophy, Indiana
University, Bloomington, IN, USA and B. C. van
Fraassen, Princeton University, NJ, USA


The Journal of Philosophical Logic is the only
journal specializing in philosophical logic, and utilizes
formal methods of dealing with topics in logical
theory. Subjects specifically included are:

contributions to branches of logical theory di- 
rectly related to philosophical concerns, such 
as inductive logic, modal logic, deontic logic, 
quantum logic, tense logic, free logic, logic of 
questions, logic of commands, logic of condi- 
tionals, many-valued logic, relevance logics; 

— contributions to philosophical discussions that 
utilize the machinery of formal logic, as in recent 
treatments of abstract entities, non-existent 
possibles, essentialism, existence, propositional
attitudes, meaning, and truth;

— discussions of philosophical issues relating to 
logic and the logical structure of language, as 
for example, conventionalism in logic, ontic 
commitment, logical or semantic paradoxes, 
the logic of hypotheses and of presuppositions, 
constructivism, and extensionality; 

— philosophical work relating to special sciences 
(for example, linguistics, history of logic, or 
[...]) with emphasis on foundational problems,
and making use of logical theory. Treatments
of this kind include universal grammar,
pragmatics, conceptions of possibility, theories 
and mathematical truth in the history of phi- 
losophy, formalization of scientific theories, 
logical structures in quantum mechanics. 


Subscription Information ISSN 0022-3611 
1989, Volume 18 (4 issues) 

Institutional rate: 

Dfl. 240.00/US$117.50 incl. p&h 

Private rate:

Dfl. 88.00/US$36.00 incl. p&h

Special rate for ASL Members: 

Dfl. 66.00/US$ 26.00 incl. p&h 


Private subscription should be sent direct to the publishers 
Back volume information is available upon request





P.O. Box 322, 3300 AH Dordrecht, The Netherlands 


P.O. Box 358, Accord Station, Hingham, MA 02018-0358, U.S.A. 


KLUWER
ACADEMIC
PUBLISHERS



Biology and Philosophy
Editor 

Michael Ruse, 


University of Guelph, Canada 


Associate Editors 


Francisco Ayala, University of California, Davis,
USA; Robert Haynes, York University, Canada;
David Hull, Northwestern University, USA


The journal is devoted to philosophical issues
arising from the life sciences. It is aimed at a broad
readership, drawn from both the sciences and the
humanities. It subscribes to no specific school of
biology or philosophy, welcoming submissions
from authors of all persuasions, and disciplines.
The tools and results of history, sociology, and like
subjects are welcomed, inasmuch as they can
carry forward discussion and understanding. To
this end, the editors are drawn evenly from biology
and philosophy, and the editorial board has
representatives from all parts of the world.


‘... this new journal is to be welcomed as a vehicle
for serious exchange on the philosophical foundations
of biology and the philosophical and conceptual
implications of biological work. It is likely
to become fairly important in fostering interdisciplinary
interaction. It offers both biologists and
philosophers a useful vehicle for publication and a
reasonable centralized locus for following the
interactions between their disparate disciplines.’
Quarterly Review of Biology


‘Biology and Philosophy has started well. ...
will perform a genuine service. ...
this is a journal that no “thoughtful biologist”
should ignore.’
Nature, Sept. 1987


Subscription Information ISSN 0169-3867 
1989, Volume 4 (4 issues)

Institutional rate: Dfl. 224.00/US$110.00 incl. postage/handling

Private rate: Dfl. 89.00/US$34.00 incl. postage/handling

Special rate for ESS members: Dfl. 80.00/US$29.50
incl. postage/handling


P.O. Box 322, 3300 AH Dordrecht, The Netherlands 


P.O. Box 358, Accord Station, Hingham, MA 02018-0358, U.S.A.








KLUWER 
ACADEMIC 
PUBLISHERS 
















Erkenntnis

Editors

Carl G. Hempel, Princeton University, USA; Wolfgang
Stegmüller, Universität München, FRG; Wilhelm K.
Essler, Universität Frankfurt, FRG


Managing Editor
Wolfgang Spohn, Institut für Philosophie, University of
Regensburg, FRG


Erkenntnis is a philosophical journal publishing papers on
foundational studies and scientific methodology covering
the following areas:
- that field of philosophy associated today with the
  notions of ‘Philosophy of Science’ and ‘Analytic
  Philosophy’ (in a wide sense);
- the philosophy of language, of logic, and of mathematics;
- the foundational problems of physics and of other
  natural sciences;
- the foundations of normative disciplines such as
  ethics, philosophy of law and of aesthetics;
- the methodology of the social sciences and the
  humanities;
- the history of scientific method.
Erkenntnis was first published in 1930, and succeeded
Annalen der Philosophie, which appeared from 1919-29.


Subscription Information ISSN 0165-0106

1989, Volumes 30-31 (6 issues)
Institutional rate: Dfl. 476.00/US$233.00 incl. p&h
Private rate: Dfl. 184.00/US$88.00 incl. p&h

Private subscriptions should be sent direct to the
publishers


P.O. Box 322, 3300 AH Dordrecht, The Netherlands
P.O. Box 358, Accord Station, Hingham, MA 02018-0358, U.S.A.


KLUWER
ACADEMIC
PUBLISHERS





History of Science 


the only review of literature and research in the history of science, 
medicine and technology in its intellectual and social context 


1962-1988 












A special post-free offer to new subscribers 
valid until 31 December 1989 


Volumes 1-26 hardbound, US $693/£346.50 


From Science History Publications Ltd, Halfpenny Furze, 
Mill Lane, Chalfont St Giles, Bucks HP8 4NR, England 


History of Science is published quarterly in issues of about 112 
pages. 
Volume 27 (1989) will include 


JAMES LONGRIGG  Presocratic Philosophy and Hippocratic
Medicine


MARIO BIAGIOLI  The Social Status of Italian Mathematicians,
1450-1600


N. WISE and CROSBIE SMITH  Lord Kelvin on Science, Work and
Beauty


M. MICALE  Hysteria, Past, Present and Future


RALPH COLP, Jr  Charles Darwin's Past and Future Biographies


WILLIAM CLARK  On the Dialectical Origins of the Research
Seminar


K. CLEAVER  Adam Smith on Astronomy


DAVID P. MILLER  “Into the Valley of Darkness”: Reflections on
the Royal Society in the Eighteenth Century


The Annual Subscription is US $97.00 post-free in the Americas and Japan, 

£48.50 elsewhere ($47.00/£23.00 direct to private subscribers). Write to:

Science History Publications Ltd, Halfpenny Furze, Mill Lane, Chalfont St 
Giles, Bucks HP8 4NR, England 











GRAZER 


PHILOSOPHISCHE STUDIEN 


BAND 33/34 


VOLUME 33/34 


WITTGENSTEIN IN FOCUS - IM BRENNPUNKT: WITTGENSTEIN 
Edited by Rudolf HALLER and Brian McGUINNESS 


Als Erstveröffentlichung


The first time in print 


21 Briefe von Gottlob FREGE an Ludwig WITTGENSTEIN 


Aufsätze von / Articles by


Johannes BRANDL
Rosaria EGIDI
Guido FRONGIA
Aldo GARGANI
Rudolf HALLER
S. Stephen HILMY
Hidé ISHIGURO
John McDOWELL
David PEARS
Colin RADFORD
Joachim SCHULTE
Hans SLUGA
Antonia SOULEZ
Ernst K. SPECHT


Herausgeber / Editor


Prof. Dr. Rudolf Haller, Institut für Philosophie, Universität Graz, Heinrichstraße 26, A-8010 Graz,
Österreich/Austria







Brit. J. Phil. Sci. 40 (1989) 429-441 Printed in Great Britain 


THE LAKATOS AWARD LECTURE*
The Nature of Reality 


MICHAEL REDHEAD 


Ever since the early days of the Scientific Revolution, the world described by 
science has been very different from the commonsense world of everyday 
experience. The world of the physical sciences is a world of entities endowed 
with the ‘real’ or primary qualities of mass, figure, motion, etc., a world devoid 
of the richness of human experience, the colours, tastes and smells we are all 
familiar with. These are the secondary qualities that are elicited by a long 
chain of cause and effect, originating in the external world imbued just with 
the primary qualities. Eddington summed up the situation with his famous 
‘two tables’, the one he sits at as he writes, solid, impassive, and the table of 
modern physics, mostly empty space with myriads of rapidly moving particles. 
Eddington was roundly attacked by the philosopher Susan Stebbing, for, 
among other things, daring to suggest that the ‘scientific’ table was more real 
than the 'ordinary' table. But of course Stebbing's arguments on this score 
would not convince the latter-day scientific realists, who take scientific
statements as expressing putative truths about the world; not truths that are
known indubitably, but in some sense warranted as scientifically acceptable,
in the light of empirical adequacy, depth, coherence, unifying power and other 
desirable features of scientific theorizing. Such a realist stance is, of course, 
controversial, but I am going to assume it for our present purposes and seek to 
pursue the nature of Eddington's scientific table a little further than he took it 
himself. 

What about the ‘particles’ the table was made up of? At one level of 
description they are the atoms and molecules of the chemist, but at a closer 
level of description these are composite entities made up of still smaller 'bits', 
the electrons, protons and neutrons. But again we can pursue the analysis, 
seeing the proton and neutron as themselves made up of other particles, the so- 
called quarks and gluons. Then there are the particles of light, the photons, 
which in some way mediate electromagnetic interactions in the same way that 
the gluons mediate the strong nuclear forces. Particle physics has catalogued a 
considerable variety of these fundamental entities, tied together in unified 
symmetry schemes of ever greater generality. What are these particles 


* The lecture was delivered at the London School of Economics on 9 February 1989 to 
mark the Award for the book Incompleteness, Nonlocality, and Realism: A Prolegomenon 
to the Philosophy of Quantum Mechanics published by The Clarendon Press, Oxford. I 
am grateful to Jennifer Redhead who assisted in the presentation of Bell's experiment. 




supposed to be like? On some accounts they are literally points imbued with 
mass, electrical and other charges that govern the scale of their interactions. 
On other accounts they are to be thought of as excitations of a field, as 
processes rather than enduring substances. And most recently they have come 
to be thought of as very small pieces of string wriggling around in space and 
time, but in some, still largely mysterious way, constitutive of the space and 
time in which they wriggle! 

All this is very interesting and relevant to the title of this lecture, but it is not 
what I am going to discuss. Whatever the ultimate nature of these atomic and 
subatomic entities, they are all supposed to obey a new type of mechanics 
called quantum mechanics, which seems on the face of it, quite different from 
the familiar Newtonian mechanics that governs the motion of large-scale 
objects such as billiard balls, or indeed tables! It is this mysterious new type of
mechanics that I want to talk about. I will continue to refer to particles, but
without prejudice as to their detailed intrinsic nature. 

Quantum mechanics allows these particles to have properties, but with 
certain queer reservations. In the first place, although these properties can be 
measured in a quantitative way the possible results of measurement are not in
general any old numbers, but may be restricted to a discrete set of possible
'quantized' values. This discreteness is what gives the new mechanics its name 
of quantum mechanics. For example, the possible energies for an electron 
bound in a hydrogen atom are quantized in this way (Figure 1). There is a 
ground state and then excited states. The discrete spectral lines radiated by 
hydrogen arise as the electron jumps from one energy level to a lower one, 
emitting the released energy as a photon of radiation.
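
The quantized levels and the spectral lines just described can be made concrete with a short sketch. It assumes the textbook Bohr formula E_n = −R/n² for hydrogen, which the lecture does not state; the constants and the function names `energy_level` and `photon_wavelength_nm` are illustrative choices, not part of the original.

```python
# Illustrative sketch only: the Bohr formula E_n = -R / n**2 is an assumption
# brought in from standard textbooks, not something stated in the lecture.

RYDBERG_EV = 13.605693   # Rydberg energy in electronvolts (approximate)
HC_EV_NM = 1239.841984   # Planck constant times speed of light, in eV * nm


def energy_level(n: int) -> float:
    """Energy in eV of the n-th bound state (n = 1 is the ground state)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -RYDBERG_EV / n**2


def photon_wavelength_nm(n_upper: int, n_lower: int) -> float:
    """Wavelength in nm of the photon emitted in a jump n_upper -> n_lower."""
    if n_upper <= n_lower:
        raise ValueError("emission requires n_upper > n_lower")
    released = energy_level(n_upper) - energy_level(n_lower)  # positive, in eV
    return HC_EV_NM / released


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, round(energy_level(n), 3))
    # Jump from n = 3 to n = 2: a red line in the visible spectrum.
    print(round(photon_wavelength_nm(3, 2), 1))
```

Only the discrete values returned by `energy_level` are allowed, so only the discrete differences between them can appear as emitted photon energies.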

But there is a second queer feature. The state of a particle in quantum 
mechanics does not, in general, allow us to predict what the result of 
measuring some physical magnitude will be, but only the probability that one 


























FIGURE 1. 




value will turn up rather than another. Quantum mechanics provides a set of 
rules for calculating these probabilities for all states and all physical 
magnitudes, together with further rules telling us how the state and the 
associated probabilities vary with time. 

So far we have talked about the probabilities of measurement results. But 
what can we say about the magnitudes when they are not being measured? 
This is the central question posed for the interpretation of quantum mechanics. 
The traditional answer of the so-called Copenhagen interpretation is that the 
physical magnitudes are not even defined except in the context of an 
experimental arrangement appropriate to measuring them. But this raises 
severe problems about the status of the physical description of the measuring 
apparatus itself. Put bluntly, should not that only be well-defined in the context 
of measuring the measuring apparatus and so on in a potentially infinite 
regress, that may only terminate when the measurement result registers in the 
consciousness of a human observer? Another view is that something is always
well-defined, but it is not in general the value of a magnitude, but the 
potentiality or propensity to produce one of a range of possible values when 
measurement is undertaken. On this view measurement consists in actualizing 
possibilities or potentialities. But again there is a fundamental problem 
associated with the fact that the formalism of quantum mechanics allows 
macroscopic objects like cats or tables to exist in states of potentiality rather 
than actuality for observables like being alive or dead for the cat, or the table 
being located here or at the other side of the room! 

To escape these sorts of problems much the most straightforward approach 
would be to assume that all magnitudes in all states possess sharp precise 
values just as in classical physics, which are simply revealed by measurement, 
so there are no mysteries like Schrödinger's dead and alive cat. The
probabilities in the quantum-mechanical formalism are epistemic, just
measuring our ignorance of what the precise state of affairs actually is. So
could quantum mechanics be interpreted as just a sort of glorified statistical
mechanics? 

In 1935 Einstein, Podolsky and Rosen produced a famous argument, 
suggesting that this might indeed be the case. They considered, in essence, a
‘black box’ (see Figure 2) producing pairs of particles in a state where quantum 
mechanics could only predict probabilities for the outcomes of measurements 
of appropriately chosen properties for the individual particles. There is another 
feature of the experiment we need to know. Measurements of the same property
on the two particles are exactly correlated. Let us represent the two particles by
two red discs, the colour indicating the property we are discussing, and let the
outcome of measurement of this 'red' property have two possible (quantized)
values +1 and −1. Then the correlation property for the two particles I just
mentioned means that the pair is (R+,R+) or (R−,R−), where the notation
indicates red = +1 for the left-moving particle with red = +1 for the right-moving
particle, and similarly for the red = −1 correlation. But, aside from the
correlation, an important feature of the experiment is that the measurement
results for an individual particle are distributed quite randomly.


FIGURE 2. [Labels: Bill (left), Jack (right)]


So, I now want to prove to you that even though quantum mechanics only
prescribes a probability for getting red = +1 for the left-moving particle and a
complementary probability for getting red = −1, nevertheless the left-moving
particle does in fact possess one or other of these two values before we
actually look to see which it is. Mutatis mutandis we can make the same claim
for the right-moving particle, but let us concentrate for the moment on the case 
of the left-moving particle. 

I know that if at some time t₁ > t₀, where t₀ is the time when the particles are
produced by the box, I look at the right-moving particle, then I have put myself in a
position to predict at any time t > t₁ whether the left-moving particle is + or
−. Thus suppose I look at the right-moving particle and see that it is +, then I
know that, if I look at any subsequent time at the left-moving particle it will 
also show +, and vice versa if the right-moving particle had shown —. Now 
Einstein claimed that accurate prediction of this sort could only be expected if 
there was what he called an 'element of reality' associated with the predicted 
value, and revealed by the subsequent observation. It is just the same 
argument as that tables really exist when I turn my back. I can predict reliably 
that I will see them again when I look again (assuming no outside interference 
like the broker's men removing the table when my back was turned). Of course 
Einstein's sufficient condition for identifying elements of physical reality is a
metaphysical assumption, famously denied by Bishop Berkeley for example. 
But give or take the odd idealist in the audience, I expect you will all feel able to 
go along with Einstein's argument up to this point. 

So at this point we have proved the existence of a definite value for the red
property of the left-moving particle at any time subsequent to t₁, when we
inspected the right-moving particle. But now, queries Einstein, what can we
say about the value for the left-moving particle for times between t₀ and t₁?
There are two alternatives, either the definite value existed throughout this
period, or, it did not exist prior to t₁, but was brought into existence by some




spooky action-at-a-distance originated by the event of our looking at the right- 
moving particle. Einstein totally rejected this second alternative and hence 
concluded that the definite value existed at all times since the emission from 
the box. 

But now the argument takes on an extraordinary and ironic twist. I shall 
show that the very same black box that Einstein used to demonstrate the 
existence of definite values for the properties, assuming no spooky action-at-a- 
distance, can be used to demonstrate that spooky action-at-a-distance must 
actually take place. 


Schematically,

    no-action-at-a-distance → possessed values → action-at-a-distance


and the conclusion from that argument by straight logic is that there is spooky
action-at-a-distance. And had Einstein lived to learn of this result, which was 
not discovered until 1964 by John Bell, he would not have been at all pleased, 
since his most famous creation, relativity theory, was also predicated on the 
impossibility of transmitting instantaneous effects at a distance! 

But before jumping ahead to discuss the implications of this extraordinary 
state of affairs, let me go through a version of Bell’s argument, which I have 
developed especially for this lecture. The idea is to consider two properties for
each of the particles instead of just one as in the original Einstein argument. So
we will consider a red and a green property for the left-moving particle, and a
blue and a yellow property for the right-moving particle. Representing the
particles as discs, each side can be regarded as painted the appropriate colour.
Since we are considering different properties for the two particles we no longer
have the perfect correlations of the Einstein argument, but nevertheless there 
will be some quite specific, less than perfect correlation, for the pairs of results 
for each of the four ‘colour ways’, red-blue, red-yellow, green-blue and green- 
yellow, that can be calculated quite explicitly by the rules of quantum 
mechanics. To be more specific, suppose we consider N pairs of particles, and 
let the number of pairs in the red-blue colour way with the values + and + be 
denoted by N(R+,B+) and similarly for all the 16 ways of co-ordinating the 4
possible pairs of outcomes for each colour way, with the 4 colour ways. 


\[
\begin{aligned}
&N(R+,B+)+N(R-,B-)+N(R+,B-)+N(R-,B+)\\
&\quad+N(R+,Y+)+N(R-,Y-)+N(R+,Y-)+N(R-,Y+)\\
&\quad+N(G+,B+)+N(G-,B-)+N(G+,B-)+N(G-,B+)\\
&\quad+N(G+,Y+)+N(G-,Y-)+N(G+,Y-)+N(G-,Y+)\\
&=4N
\end{aligned}
\tag{1}
\]


Now suppose we change some of the signs so as to subtract some of the terms
on the L.H.S. of (1). The result must be ≤ 4N. But now I raise the question: Can
we judiciously juggle the + and − signs on the L.H.S. of (1) so that we can
tighten the bound below 4N? The answer is ‘Yes’. I will now demonstrate to you
that the following expression (in which I have rearranged the order of the
terms so as to start with all additions followed by all subtractions) is always
≤ 2N


\[
\begin{aligned}
&N(R+,B+)+N(R-,B-)+N(R+,Y+)\\
&\quad+N(R-,Y-)+N(G+,B+)+N(G-,B-)\\
&\quad+N(G+,Y-)+N(G-,Y+)\\
&\quad-N(R+,B-)-N(R-,B+)-N(R+,Y-)\\
&\quad-N(R-,Y+)-N(G+,B-)-N(G-,B+)\\
&\quad-N(G+,Y+)-N(G-,Y-)\\
&\le 2N
\end{aligned}
\tag{2}
\]


In passing you may wonder whether analogous results hold for the general
n × n′ case where n is the number of properties considered for the left-moving
particle and n′ the number of properties considered for the right-moving
particle. It is easy to show that we cannot get anything of this sort for the 1 × 1
case or the 2 × 1 case. The 2 × 2 case is, of course, the one we are considering.
For 2 × 3, and 3 × 3, etc., some interesting results have been obtained using the
powerful methods of linear programming and convex analysis, but this is a 
topic that is currently under active investigation, and serves to demonstrate 
how a particular result apparently ‘discovered’ by a physicist, can often be 
located in an esoteric, but essentially well-known mathematical framework. 

To return to our main task, I will prove (2) to you by actually doing the 
experiment, trying a few pairs of particles, and seeing how things turn out. 

So here comes the first pair of particles. It has R+ and G− for the left-moving
particle and B− and Y+ for the right-moving particle. So we fill in the
first row of Table 1. We note that for this one trial the L.H.S. of (2) is just 2, so
the bound has not been broken for N = 1. Now here comes the second pair of
particles. It has R− and G− for the left-moving particle and B+ and Y+ for
the right-moving particle. So we can fill in the second row of Table 1.

Adding the two rows gives 0 for the L.H.S. of (2), so again the bound is not
broken for N = 2. But now we can see how things are turning out. It is easy to
check that each row of the table contributes either + 2 or —2, whatever the 
assignment of values to colours, so adding N rows can never get bigger in 
magnitude than 2N, which is what the inequality (2) says. 
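
The claim that every row contributes either +2 or −2 can be checked exhaustively, since there are only 16 ways of assigning ± values to the four colours. The following is a minimal sketch under the lecture's own assumption that each pair carries definite values for all four colours; the function names are hypothetical. A pair with values r, g, b, y (each +1 or −1) adds the product r·b to the red-blue part of (2), because that product is +1 when the signs agree (an added term) and −1 when they differ (a subtracted term); the green-yellow part enters with the opposite sign.

```python
import itertools
import random


def row_contribution(r: int, g: int, b: int, y: int) -> int:
    """Net contribution of one pair to the L.H.S. of (2); arguments are +1 or -1."""
    # Agreeing signs are added for red-blue, red-yellow and green-blue,
    # while for green-yellow the opposite-sign terms are the added ones.
    return r * b + r * y + g * b - g * y


def all_contributions() -> dict:
    """Map every one of the 16 possible value assignments to its contribution."""
    return {
        values: row_contribution(*values)
        for values in itertools.product((+1, -1), repeat=4)
    }


def lhs_of_2(pairs) -> int:
    """L.H.S. of (2) for a list of (r, g, b, y) assignments, one per pair."""
    return sum(row_contribution(*pair) for pair in pairs)


if __name__ == "__main__":
    print(sorted(set(all_contributions().values())))
    rng = random.Random(0)  # fixed seed so the run is repeatable
    pairs = [tuple(rng.choice((+1, -1)) for _ in range(4)) for _ in range(1000)]
    print(lhs_of_2(pairs), "is bounded in magnitude by", 2 * len(pairs))
```

The two pairs worked through above correspond to `row_contribution(+1, -1, -1, +1)` and `row_contribution(-1, -1, +1, +1)`, which give +2 and −2 respectively and so sum to 0.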

TABLE 1. [One row per pair of particles, recording which of the sixteen terms of (2) the pair contributes to.]


But now let N get very large (tend to infinity). Then N(R+,B+)/N → Prob(R+,B+),
the probability of getting outcomes + and + for the red-blue
colour way, and so on for all the 16 terms in (2). So we end up with the
result


\[
\begin{aligned}
\bigl|\,&\mathrm{Prob}(R+,B+)+\mathrm{Prob}(R-,B-)+\mathrm{Prob}(R+,Y+)\\
&+\mathrm{Prob}(R-,Y-)+\mathrm{Prob}(G+,B+)+\mathrm{Prob}(G-,B-)\\
&+\mathrm{Prob}(G+,Y-)+\mathrm{Prob}(G-,Y+)\\
&-\mathrm{Prob}(R+,B-)-\mathrm{Prob}(R-,B+)-\mathrm{Prob}(R+,Y-)\\
&-\mathrm{Prob}(R-,Y+)-\mathrm{Prob}(G+,B-)-\mathrm{Prob}(G-,B+)\\
&-\mathrm{Prob}(G+,Y+)-\mathrm{Prob}(G-,Y-)\,\bigr|\\
&\le 2
\end{aligned}
\tag{3}
\]


But the probabilities are what can be calculated by quantum mechanics, 
and for appropriate choice of the properties red, green, blue and yellow, the 
L.H.S. of (3) comes out at 2√2, in violation of the inequality. The experiments
have been done and that is what actually happens. So how can this possibly 
be? Either the particles do not have the definite values inscribed on them (but 
the Einstein-Podolsky-Rosen argument shows that they do) or if they do these 
values must flip, at any rate in some cases, from + to — or — to +, according 
to which side of the remote particle is being inspected. But that is spooky action- 
at-a-distance with a vengeance. 
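
The figure of 2√2 can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the calculation given in the lecture: it assumes each colour corresponds to a measurement direction at some angle, and that the joint probabilities are ½cos²(a − b) for equal outcomes and ½sin²(a − b) for opposite outcomes (the standard form for a pair of polarization-entangled photons). This choice reproduces the perfect correlation for identical settings and the 50% frequencies for each particle taken singly. The angles in `ANGLES` and all function names are hypothetical.

```python
import math

# Assumed measurement directions in degrees for the four colours; red and
# green belong to the left-moving particle, blue and yellow to the right.
ANGLES = {"R": 0.0, "G": 45.0, "B": 22.5, "Y": -22.5}


def joint_prob(a_deg: float, b_deg: float, left: int, right: int) -> float:
    """Assumed quantum probability of outcomes left, right (each +1 or -1)."""
    delta = math.radians(a_deg - b_deg)
    if left == right:
        return 0.5 * math.cos(delta) ** 2
    return 0.5 * math.sin(delta) ** 2


def correlation(a_deg: float, b_deg: float) -> float:
    """Prob(equal outcomes) minus Prob(opposite outcomes) for one colour way."""
    return sum(
        left * right * joint_prob(a_deg, b_deg, left, right)
        for left in (+1, -1)
        for right in (+1, -1)
    )


def lhs_of_3(angles=ANGLES) -> float:
    """L.H.S. of (3): three colour ways added, green-yellow subtracted."""
    def e(x, y):
        return correlation(angles[x], angles[y])
    return abs(e("R", "B") + e("R", "Y") + e("G", "B") - e("G", "Y"))


if __name__ == "__main__":
    print(lhs_of_3(), 2 * math.sqrt(2))
```

With these angles each of the four colour ways contributes cos 45° in magnitude, so the total is 4 × (1/√2) = 2√2, which exceeds the bound of 2 in (3).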

To illustrate: Here I have a left-moving particle showing R+ and the right-moving
particle shows Y+. Now look at the green side of the left-moving
particle. It shows G−. Now inspect the blue side of the right-moving particle. It
shows B−, with the left-moving particle still showing G−. What does the red
side of the left-moving particle show? Could it have flipped to R−, when we
changed what we were looking at for the right-moving particle from yellow to
blue? Well, I am only a philosopher not a conjuror, so I cannot show that
happen with cardboard discs (hold up R− in view of audience, but not of
myself!). You need really expensive apparatus, with lasers and photons, to
show it happen.

There is another vital piece of the story to add in here. We cannot look
simultaneously at both sides of the disc. (This reflects Heisenberg's famous 
uncertainty principle, which prevents simultaneous measurability of certain 
quantities in quantum mechanics.) If that were not so, we could check the 
value assignments for all four colours on the two discs and then, whatever these 
assignments, the inequality (3) would follow. But if we cannot look at both 
sides simultaneously, then the value assigned to one colour on the left-moving 
particle, for example, could change if we had looked at the colour on the right- 
moving particle that we did not actually look at. But expressed in this 
counterfactual way, are we being committed to action-at-a-distance, if the 
inequality is violated? That is a slightly ticklish question in the logic of
counterfactuals. 

Let me try out two examples on you. First, consider a clock at the far side of 
the room. I raise my hand just as it strikes six o'clock. I now ask the question: ‘If 
I had not raised my hand would the clock still have struck?’. Assuming no 




spooky action-at-a-distance, I think you would all agree the answer is ‘Yes’. 

But now consider an atom of radium instead of the clock. I raise my hand 
and it decays at six o'clock. Again I ask ‘If I had not raised my hand would it 
have decayed at the same time as it actually did, viz. six o'clock?’ It is not clear
what the answer should be, if the decay is an intrinsically indeterministic
process. I run the world over again up to six o'clock with my hand not raised,
but this time the atom might well not decay, not because of any spooky effect of
my hand on the atom, but simply because that is what indeterminism means:
specifying the world up to six o'clock does not fix what actually happens at six
o'clock! Or consider the conundrum of the indeterministic roulette wheel. I
place my bet on No. 16 and No. 17 turns up. Am I justified, for a genuinely
indeterministic roulette wheel, in exclaiming ‘If only I had placed my bet on
No. 17, I would have won a fortune!’?

All this suggests a way out of our predicament. Just suppose that the +’s and 
−’s are not fixed on each side of the disc but are varying in an indeterministic
or stochastic fashion. Then I could not derive the inequality, and its violation
would tell me nothing about action-at-a-distance. But there is a difficulty with
this solution. Let us go back to the original Einstein experiment. When I found
red = +1 for the right-moving particle, then whenever I look at the left-moving
particle it will always show the same value, viz. +1, so the number
cannot be changing in the random manner proposed, unless the fact of looking 
at the right-moving particle changed the behaviour of the left-moving particle 
so as to fix the subsequent value. But that would be spooky action-at-a- 
distance, the very thing we are trying to avoid. 

So, what can we say about this action-at-a-distance that seems endemic to 
quantum mechanics, and how can it be reconciled with the theory of 
relativity?

The first surprising, but very relevant, thing to notice is that the effect we are 
discussing cannot be used to transmit very fast, even instantaneous, signals. 
As I put it in my book there is no such thing as a Bell telephone! To see this, 
consider again the Einstein version of our experiment. Remember that, for the 
left-moving particle, for example, successive measurements yield red = +1 or
red = −1 in a quite random fashion. The long-run frequencies are stable,
typically 50% for each outcome, but the sequence of outcomes has no pattern
or order to it. It might look like + − + − − + − + etc. Now imagine Jack
manipulating the right-moving particles. Can he do anything to inform Bill,
who we suppose to be observing the left-moving particles, of what he, Jack, is
up to? The answer is ‘No’. All Jack can do is to note whether he is getting
red = +1 or red = −1 for his particle on a particular occasion. This enables
him to know, instantaneously, what result Bill is going to observe. But that
does not change anything for Bill. To achieve a change in the sequence that Bill
is observing, Jack would have to transmit his information (using a conventional
telephone for example) warning Bill to put an absorbing screen in front of
every particle that was going to show red = −1. Then the particles that got
through for Bill to look at would all show red = +1, so the original random
sequence would have got altered, but not spookily or instantaneously. The 
solution here is no more mysterious than the fact that when I walked into the 
lecture theatre at LSE you all learned instantaneously that my room at 
Cambridge was empty! But to use this knowledge to produce a physical change 
at Cambridge, like preventing students from knocking on my door, would 
require you to transmit your information, by conventional means, back from 
London to Cambridge. 

In fact it is possible to prove quite generally, from the formalism of quantum 
mechanics, that there is no way of violating what in my book I called statistical 
locality, i.e. the impossibility of changing the marginal probabilities for
measurement outcomes of any physical magnitude associated with one 
particle by any operation performed on the other particle. 
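
Statistical locality can be illustrated with a small calculation. The sketch below is not taken from the lecture: it assumes a textbook-style joint distribution for two outcomes a, b in {+1, −1} in which equal settings give perfectly matching results, and the function names and the settings tried are hypothetical. It checks that Bill's marginal probability stays at one half whatever setting Jack chooses.

```python
import math


def joint_probability(a, b, theta_jack, theta_bill):
    """Assumed joint distribution for outcomes a, b in {+1, -1}.

    With equal settings the two outcomes always match, as in the
    Einstein version of the experiment. The functional form is an
    assumption of this sketch, not something stated in the lecture.
    """
    return (1 + a * b * math.cos(theta_jack - theta_bill)) / 4


def bill_marginal(b, theta_jack, theta_bill):
    """Probability that Bill sees outcome b, summed over Jack's outcomes."""
    return sum(joint_probability(a, b, theta_jack, theta_bill) for a in (+1, -1))


# Bill's marginal for the outcome +1 under several hypothetical choices
# of Jack's setting; Bill's own setting is held fixed at 0.
jack_settings = (0.0, 0.7, math.pi / 2, 2.5)
marginals = [bill_marginal(+1, theta, 0.0) for theta in jack_settings]
```

Under this assumed distribution the correlation between the two outcomes varies with Jack's setting, but nothing Jack does shows up in the frequencies Bill can observe on his own side.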

Now, in relativity theory it is the possibility of transmitting signals at 
superluminal speeds that is standardly regarded as prohibited. So already we 
have a clue that the nonlocality we have demonstrated in quantum-mechanics 
may not necessarily violate relativistic considerations. But let us probe this 
matter further. In standard philosophical discussions of relativity theory it is 
not signalling as such which is at issue but rather the impossibility of a causal 
process propagating itself at superluminal speeds. So what is the connection 
between causality and signalling? It is generally assumed that causality is a
necessary and sufficient condition for signalling. I am going to argue that the 
connection is, in general, less simple, and needs to be tackled by introducing a 
necessary condition for causality, which I call robustness. 

Consider a spring connecting an object A to a rigid wall B (see Figure 3). 
Suppose A is pulled to the right a distance l, then the force exerted on B is k × l
where k is a constant, which we will call the spring constant. We may regard
the displacement l as the cause of the force k × l (which will be experienced by B
at a suitably later time than the displacement of A as the elastic deformation 
propagates along the spring). The causal connection implies that there is a 





FIGURE 3. 


definite relationship of fixed mathematical form between the displacement of A 
and the force on B. In particular, the relationship, characterized by the spring 
constant k, does not depend on how the displacement l is brought about, for
example it does not depend on whether Jack pulls the spring or whether Bill 
pulls the spring. This invariance property of the causal relation I will refer to as 
robustness. I claim it is a necessary condition for a causal connection. Suppose, 
in our example, that the spring constant itself depended on who pulled the 
spring, then the displacement alone could not be regarded as the cause of the 
force experienced by B. Of course we could restore the notion of cause by citing 
not just the magnitude of the displacement, but also who pulled the spring, but 
suppose this composite cause was again not robust, the spring constant 
depending on who pulled Jack who pulled the spring and so on ad infinitum, then I
claim we would be in a marsh-mallow world, where the notion of cause was no
longer applicable. But that does not mean that Jack could not signal. Every
time he pulls the object A a distance l, some force is experienced at B, although
the magnitude might depend on an infinite regress of all the circumstances 
that brought about the event of Jack pulling the spring. 

So causality ⇒ robustness ⇒ signalling, i.e. causality is a sufficient condition
for signalling, but not a necessary one. This point needs some additional 
comment. Signalling does not imply a causal relation between displacement 
and force, but it does imply a causal relationship between certain other 
physical magnitudes, which I shall refer to as gisplacement and gorse. Gorse 
has the value zero if the force is zero, and the value one if the force is non-zero. 
We make a similar definition relating gisplacement and displacement. Then, in 
our marsh-mallow world, there is no robust relation between displacement 
and force, as we have seen, but there is a robust relation between gisplacement 
and gorse. Filling out the necessary condition of robustness with features of 
constant conjunction, contiguity and temporal succession allows us to infer a 
causal connection between gisplacement and gorse. 
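
The contrast between the two pairs of magnitudes can be sketched in a few lines. Everything specific here is hypothetical: the per-person spring constants are invented to mimic the marsh-mallow world, and `is_robust` is only one simple way of rendering the invariance property described above.

```python
def force_on_wall(displacement, puller):
    """Force at B in a hypothetical marsh-mallow world.

    The effective spring constant depends on who pulls, so the relation
    between displacement and force is not robust. The constants are
    made up for illustration.
    """
    spring_constant = {"Jack": 2.0, "Bill": 5.0}[puller]
    return spring_constant * displacement


def gisplacement(displacement):
    """0 if the displacement is zero, 1 otherwise."""
    return 0 if displacement == 0 else 1


def gorse(force):
    """0 if the force is zero, 1 otherwise."""
    return 0 if force == 0 else 1


def gorse_from_displacement(displacement, puller):
    """Gorse produced at B when the given person pulls A by the given amount."""
    return gorse(force_on_wall(displacement, puller))


def is_robust(relation, inputs, pullers):
    """True if the output depends only on the input, not on who pulls."""
    return all(len({relation(x, p) for p in pullers}) == 1 for x in inputs)


pullers = ["Jack", "Bill"]
displacements = [0.0, 1.0, 3.0]

force_relation_robust = is_robust(force_on_wall, displacements, pullers)
gorse_relation_robust = is_robust(gorse_from_displacement, displacements, pullers)
```

In this toy world Jack can still signal, because a non-zero gisplacement always yields a non-zero gorse, even though the magnitude of the force tells B nothing stable about the magnitude of the displacement.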

From the foregoing discussion, we now have the following implications:
no-signalling ⇒ non-robustness ⇒ absence of causality, for any conceivable pair of
magnitudes. So can our no-signalling result in the quantum-mechanical case 
be used to infer the absence of a causal connection, and hence defuse the 
putative conflict with relativity? 

I fear not, because the argument so far has been predicated on determinism. 
The question of probabilistic causality involved in maintaining correlations 
between essentially indeterministic events is a more delicate one. I will merely 
state my conclusions on this question. Again we have causality ⇒ robustness,
but now in general robustness ⇏ signalling. (To be more precise this arises
when the signalling perturbations do not allow sufficiently arbitrary control 
over the local marginal probabilities, a fairly typical situation in quantum 
mechanics.) So we cannot infer the absence of probabilistic causality from the 
no-signalling theorem. A separate investigation is necessary to check whether
the correlations in the Bell experiment do or do not satisfy the appropriate
technical definition of robustness. Suffice it to say that in my book I 
demonstrated by direct calculation that the Bell correlations were non-robust 
and hence were not being maintained by a direct causal relationship between 
events consisting of physical magnitudes for the two particles taking values. 
So how are we to understand the mysterious Bell correlations? Abner 
Shimony has coined the phrase ‘passion-at-a-distance’ to capture the 
harmonious behaviour of the separated particles in the Bell experiment. In my 
own work I have tried to flesh out this idea with some mathematical precision 
via the notion of non-robustness. Let me give an illustration of what passion is 
like. Imagine that my two hands are correlated by a stick held between them. 
As I move my hands around there is a definite correlation, a fixed distance 
between them. But now suppose that when one of my hands is disturbed, 
however gently, a new correlation, a new distance between them, is 
established. Definite correlations are there if I do not poke or disturb the system 
in any way, but they are too delicate, non-robust as I put it, to reflect a causal 
connection. This, I believe, is how we should think of the Bell correlations. 
Notice the importance of indeterminism in this whole analysis. If there is 
underlying determinism, I think that there is no plausible way of avoiding the 
conclusion that superluminal causal action is at work in the Bell experiments, 
not indeed between events consisting of physical magnitudes taking values, 
but linking the value of a physical magnitude for one particle to the 
experimental environment confronting the other particle. Notice that this does 
not conflict with the formal no-signalling theorem, which is a statistical result
and does not apply to what happens on a particular occasion. Moreover note 
that observation on a single pair of particles cannot be used to signal, since it 
can be shown that as soon as we inspect the state of one particle, the causal 
influence of the remote environmental context becomes inoperative. 
Passion suggests that the two particles are to be thought of as, in some sense, 
a single ‘whole’, rather than as two separate entities. A substantial part of my
book was concerned with developing in other ways the distinction between
particles with individual properties, having these properties affected
instantaneously and spookily at a distance, and the idea that the particles do not really
have individual properties at all, but only relativized to a holistic context. Such 
holistic ideas have been invoked previously by authors such as Niels Bohr and 
David Bohm in the context of understanding quantum mechanics. But my 
own work has differed from theirs in trying to provide a precise mathematical 
framework in which to express such ideas and draw the relevant distinctions. I 
believe that that is the way forward in trying to get a grip on the implications of 
modern physics for the metaphysical nature of reality. 
Let me conclude by quoting from a famous Princeton philosopher, writing in 
1986 in the preface of vol. 2 of his collected papers. ‘I am not ready to take 
lessons in ontology from quantum physics as it now is. First I must see how it
looks when it is purified of instrumentalist frivolity and dares to say something
not just about pointer readings but about the constitution of the world; and 
when it is purified of doublethinking deviant logic; and—most of all—when it 
is purified of supernatural tales about the power of the observant mind to make 
things jump.’ I submit that much of that work has in fact now been done and 
that philosophers should stand ready to evaluate the results. 


Wolfson College 
University of Cambridge 


Brit. J. Phil. Sci. 40 (1989), 443-457 Printed in Great Britain 


On the Necessity for Random Sampling 


D. J. JOHNSTONE 


1 Introduction 
2 Fisher's Definition of Probability 
3 The Probability (Level of Significance) P 
4 P-levels Without Random Sampling 
5 An Analogy 
6 Random Sampling and R. A. Fisher 
7 Random Sampling and Bayes' Postulate 
8 Random Sampling Remains Useful 
9 Practical Implications 
10 Conclusion 


1 INTRODUCTION


It is stressed repeatedly in the writings of orthodox (non-Bayesian) statisticians 
that research workers must not employ the methods described therein other 
than with samples drawn strictly at random. For example, the remarks of 
Fisher, Neyman, and Pearson, quoted below, can be considered authoritative 
statements of the classical standpoint: 


If the two groups have been chosen from their respective populations in such a 
way as not to be random samples of the populations they represent, then an 
examination of the samples will clearly not enable us to compare these 
populations; . . . (Fisher [1936] p. 58) 


It is known that if the sample is systematically selected and not drawn ‘at 
random’ the conclusions concerning the population x formed on its basis are, as 
a rule, false and at the present state of our knowledge impossible to justify. On the 
other hand, we know that justifiable and frequently correct conclusions are 
possible only when the process of drawing the sample is ‘random’, though the 
randomness may be at times more or less restricted. (Neyman [1937] p. 251) 


... a random process must either have been purposely introduced or be assumed
to have been present in the collection of data; . . . (Pearson [1947] p. 143) 


In very many fields of inquiry, samples drawn strictly at random are not often 
available, at least not at a reasonable price. Hence, in practice, research
workers may be disinclined to apply statistical methods, even when these are
clearly applicable. Alternatively, statistical techniques are put into practice,
but with samples which are not strictly random in the classical statistical
sense.¹

According to most textbooks, it is indefensible to employ statistical methods 
with samples which are not strictly random. However, for many years, 
beginning notably with W. S. Gosset (pseudonym 'Student'), who in
correspondence with R. A. Fisher developed the common t-test, there has been 
some dissent in statistics concerning the edict that experiments (data) be 
randomized. The renowned subjectivist L. J. Savage [1962, pp. 88-9] wrote 
that he was not sure of any logical basis for experimental randomization, 
certainly not with Bayesian logic, nor within the conventional (Fisherian) 
logic for statistical inference. Most Bayesians accept that randomization is 
helpful methodologically, as a precaution against personal bias in sampling, 
but deny that the principle of randomization has any basis logically in 
programs for inductive inference (e.g., Kadane and Seidenfeld [1987], pp. 1-3).
Beginning with Fisher, there have been repeated attempts to explain the
basis for experimental randomization within a logic for inference, but the issue 
has not been properly resolved: 


Randomization is widely recognized as a basic principle of statistical experimen- 
tation. Yet we find no satisfactory answer to the question Why randomize? (Basu 
[1980] p. 575) 


Still deep within me, I have the feeling that the interpretation is clearer, the
conclusions are stronger and the analysis has greater validity if treatments have 
actually been assigned at random. But why? I do not know and none of the 
explanations which have been advanced are totally satisfactory to me. (Folks 
[1984] p. 30) 


Contrary to orthodox textbooks and all conventional wisdom, there is an 
argument in the philosophy of statistical inference that random sampling (and 
hence experimental randomization) is not logically necessary. In the literature 
in philosophy, this argument can be attributed primarily to Kyburg [1974, pp. 
360-1; 1976, pp. 365-75]; see also Leeds [1981, p. 84]. In statistics, see 
Lindley [1972, p. 39; 1983, p. 439; 1984, pp. 456-7] and Lindley and Novick 
[1981, pp. 51-2]. Kyburg and Lindley maintain independently that experi- 
mental randomization has no basis within the conventional (Fisherian) logic 
for inference. In what follows, I have attempted to elaborate and support this 
argument. The explanation proffered relates explicitly to statistical significance
tests, but applies mutatis mutandis to estimation by way of confidence
intervals; these are analogous logically to hypothesis tests (Cox and Hinkley
[1974, pp. 212-14], and Seidenfeld [1979, p. 54]).


¹ A (simple) random sample without replacement is one selected such that each possible
combination of size n is drawn with (equal) predeterminate relative frequency in the ‘long run’.
On this, the classical definition, a sample is random depending not on what it includes, or looks
like, but on the mechanism with which it was drawn (Kyburg [1974, p. 368]).

The arguments of Lindley and Kyburg are conceptually equivalent but 
expressed in different terms. Lindley's explanation is in terms of 'exchangeability',
as defined by de Finetti, whereas Kyburg's argument is expressed in
Fisher's language of 'relevant' subsets. The relationship between these 
constructs is explained by Lindley and Novick [1981, pp. 46-8]. 


2 FISHER'S DEFINITION OF PROBABILITY 


Fisher was not an orthodox or 'classical' frequentist in the way of Neyman and 
his school. Contrary to the ordinary frequentist interpretation of probability, 
Fisher interpreted probability generally as a degree of certainty (or uncertainty) 
in a particular single case. Savage [1976, p. 462] conjectured that this is partly
why the revered Fisher became more and more isolated from the mainstream 
of orthodox statisticians, of whom most could not comprehend or accept 
Fisher's unorthodox (single case) interpretation of probability. 

Fisher measured probability by direct inference from observed or analytical 
(ex hypothesi) relative frequencies. On his account [1956, p. 35; 1958, p. 263; 
1959, p. 23; 1960, p. 5; 1962, p. 530] the probability Pr(E) of the event E in 
the single case X is given by the relative frequency f(E) measured over a 
reference set R (where $X \in R$), but strictly on the condition that R is free from
any recognizable 'relevant' (with respect to E-ness) subset. By Fisher's
definition [1958, pp. 263-7; 1959, pp. 23-6], a subset r from the reference set
R is a ‘relevant’ (with respect to E-ness) subset if the relative frequency f(E)
measured over r is not the same as f(E) measured over R; i.e., r is relevant if
$f(E \mid X \in r) \neq f(E \mid X \in R)$. Thus, for Fisher, a relative frequency f(E) is a legitimate
probability in the single case X iff there is no recognizable subset r for which
$f(E \mid X \in r) \neq f(E \mid X \in R)$. This is Fisher's principle of ‘necessary ignorance’. In his
words: 


The necessary ignorance is specified by our inability to discriminate any of the
different sub-aggregates [subsets] having different frequency ratios, such as must 
always exist. (Fisher [1956] p. 36) 


The subject of a statement of probability must not only belong to a measurable
set, of which a known fraction fulfils a certain condition, but every subset to
which it belongs, and which is characterized by a different fraction, must be
unrecognizable. (Fisher [1956] p. 60)


² This interpretation of Fisher follows Kyburg [1974, p. 72], Seidenfeld [1979, pp. 71-2] and
Johnstone [1987a, pp. 482-4]. Neyman's position is described in Johnstone [1987b, pp. 268-9;
1988, pp. 354-5].

³ Strictly, this condition cannot be satisfied; it is always possible to define relevant objects which
include X. For example, suppose X belongs to the reference set R and the proportion of elements
of R that have the property E is p. One definable subset which includes X and has a proportion of
elements with property E not equal to p is $\{X\} \cup (R \cap E)$. This set has 100% E, or somewhat less,
depending on whether X is E or not. Subsets such as this are artificial but not easy to rule out.
(This point is due to an anonymous referee.)
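
Fisher's relevance condition can be made concrete with a small sketch. The reference set, the property E, and the function names below are all hypothetical choices made for illustration; the code simply compares the relative frequency of E in a subset with its relative frequency in the whole reference set, and then builds the artificial subset consisting of X together with every element of R that has E.

```python
from fractions import Fraction


def relative_frequency(cases, has_property):
    """Proportion of the given cases that have the property."""
    cases = list(cases)
    return Fraction(sum(1 for c in cases if has_property(c)), len(cases))


def is_relevant(subset, reference_set, has_property):
    """A subset is relevant if its frequency of E differs from that of R."""
    return (relative_frequency(subset, has_property)
            != relative_frequency(reference_set, has_property))


def has_e(case):
    """Hypothetical property E: the case is divisible by 4."""
    return case % 4 == 0


# Hypothetical reference set R: the integers 1 to 20.
R = list(range(1, 21))

evens = [c for c in R if c % 2 == 0]   # frequency of E is 1/2
low_eight = [c for c in R if c <= 8]   # frequency of E is 1/4, as in R

# The artificial subset {X} united with (R intersect E), for a case X without E.
X = 7
artificial = sorted({X} | {c for c in R if has_e(c)})
```

Here `evens` is a relevant subset and `low_eight` is not, while the artificial subset is relevant by construction: it contains five cases with E and the single case X without it.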


3 THE PROBABILITY (LEVEL OF SIGNIFICANCE) P 


The level of significance P of the sample X is (by definition) the probability on
the null hypothesis $H_0$ of a sample as (or more) discrepant with $H_0$ as that
sample X observed. On Fisher's interpretation, consistent with his general
definition, (i) the probability (or 'P-level') P of the sample X represents the
degree of certainty (on $H_0$), in the given single test, of a sample as discrepant
with $H_0$ as X itself,⁴ and (ii) P is given by the relative frequency with which
samples in the reference set R are as discrepant with $H_0$ as X, but (again
strictly) on the condition that R, the reference set employed in the test, is free
from any recognizable relevant (with respect to P) subset.

The principle that the reference set R must be free from any recognizable 
relevant subset has of course the corollary that if R includes a recognizable 
relevant subset, say r, where $X \in r$, then that subset is the more appropriate set
on which to measure the probability Pr(E) in the single case X. Fisher wrote: 


Manifestly, if the subject [single case] did belong to such a recognizable subset the
latter would replace the original set, as the appropriate basis for the probability 
statement. (Fisher [1962] p. 530) 


Thus, if the reference set R is found to include a recognizable relevant subset r, 
where $X \in r$, then R must give way to r. Or in terms of ‘conditionality’, R must be
conditioned on the property which defines that subset r; i.e., the property
which is peculiar to those particular samples which comprise r. Hence Fisher's 
concern with ancillary statistics, the observed value of which, say a(X), may 
often define a relevant subset of samples within the reference set (sample space) 
of all possible samples. Fisher's remarks on relevant subsets, conditionality, 
and ancillary statistics [1955, pp. 71-2; 1956, pp. 86-92; 1960, pp. 6-7] 
constitute the logical antecedents for more recent principles of ancillarity, and 
other forms of partial conditioning (e.g., Cox [1958, pp. 359-63], Buehler 
[1959, pp. 846-7], Birnbaum [1962, p. 271], Cox and Hinkley [1974, pp. 38-9];
cf., Rosenkrantz [1977, pp. 216-20], Lindley and Novick [1981, p. 48],
Berger and Wolpert [1984, pp. 11-18], and Lehmann [1986, pp. 539-65]).


4 P-LEVELS WITHOUT RANDOM SAMPLING 


For example, let $X_{\mathrm{obs}} = \{x_1, x_2, \ldots, x_n\}$ be a sample, drawn from a population
$\Pi \sim N(\mu, \sigma)$, with ‘z-score’ $z(X_{\mathrm{obs}}) = (\bar{x} - \mu_0)\sqrt{n}/\sigma$ equal to say 1.96.
Conventionally, the level of significance or 'P-level', $P(X_{\mathrm{obs}})$, of the sample
observed, $X_{\mathrm{obs}}$, is measured as follows:


A: Given the null hypothesis $H_0$: $\mu = \mu_0$, 5% of simple random
   samples X drawn from $\Pi$ have $|z| \geq 1.96$.
   I.e., $\Pr\{X : |z| \geq 1.96;\ H_0\} = .05$.

B: $X_{\mathrm{obs}}$ is a simple random sample from $\Pi$.

C: $z(X_{\mathrm{obs}}) = 1.96$.

D: By definition, the two-tailed level of significance $P(X_{\mathrm{obs}})$
   equals $\Pr\{X : |z| \geq |z(X_{\mathrm{obs}})|;\ H_0\}$.

$\Rightarrow$ E: $P(X_{\mathrm{obs}}) = \Pr\{X : |z| \geq 1.96;\ H_0\} = .05$. And similarly for any
   $X_{\mathrm{obs}}$.


⁴ Fisher interpreted the level of significance (P-level) of the sample X as a degree of certainty in the
single case, in the sense Carnap [1950, pp. 23-5] labelled 'probability₁'. This interpretation of
Fisher follows Kyburg [1974, p. 72], Seidenfeld [1979, pp. 71-2] and Johnstone [1987a, pp.
484-5].
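
The arithmetic behind the figures 1.96 and .05 can be checked directly. The sketch below is illustrative only: the function names and the sample values (a mean of 10.98 from 100 observations, with a hypothesized mean of 10 and a known standard deviation of 5) are hypothetical, chosen so that the z-score comes out at 1.96. The two-tailed probability is computed from the standard normal distribution using the complementary error function.

```python
import math


def z_score(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) * sqrt(n) / sigma."""
    return (sample_mean - mu0) * math.sqrt(n) / sigma


def two_tailed_p(z):
    """Probability that |Z| >= |z| for a standard normal variable Z."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical sample summary chosen to give z close to 1.96.
z_obs = z_score(sample_mean=10.98, mu0=10.0, sigma=5.0, n=100)
p_obs = two_tailed_p(z_obs)
```

Note that nothing in this calculation refers to how the sample was drawn; the premise of random sampling enters only in the step from the sampling distribution to the particular sample observed, which is the step at issue here.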


Following this conventional logic, we proceed from premise A, which is merely 
an abbreviated statement of the usual sampling distribution, via a premise of 
random sampling, to the observed level of significance $P(X_{\mathrm{obs}})$. Now consider
an alternative logic by which we can measure $P(X_{\mathrm{obs}})$, beginning with the
same premise A but doing without the conventional premise of random 
sampling: 


A: Given the null hypothesis $H_0$: $\mu = \mu_0$, 5% of simple random
   samples X drawn from $\Pi$ have $|z| \geq 1.96$.

B: Given the null hypothesis $H_0$: $\mu = \mu_0$, 5% of all possible
   samples X from $\Pi$ have $|z| \geq 1.96$.
   I.e., $\Pr\{X : |z| \geq 1.96;\ H_0\} = .05$.

C: $X_{\mathrm{obs}}$ is a sample from $\Pi$.

D: $z(X_{\mathrm{obs}}) = 1.96$.

E: By definition, the two-tailed level of significance $P(X_{\mathrm{obs}})$
   equals $\Pr\{X : |z| \geq |z(X_{\mathrm{obs}})|;\ H_0\}$.

$\Rightarrow$ F: $P(X_{\mathrm{obs}}) = .05$. And similarly for any $X_{\mathrm{obs}}$.


This alternative logic includes the additional inference A $\Rightarrow$ B that the set of all
possible samples from $\Pi$ is distributed (on $H_0$) identically (with respect to z) to
the set of all simple random samples drawn from $\Pi$. These sets are actually the
same. That is, the set of all simple random samples (of size n) drawn from $\Pi$ is
logically the same set as the set of all possible samples (of size n) from $\Pi$; by
definition, the process of simple random sampling entails selection of a single
sample arbitrarily from a listing of all possible samples, each possible sample
(n-fold combination) being listed just once.
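
The identity of the two sets can be seen by enumeration in a small finite case. The sketch below is an illustration under stated assumptions, not part of the argument: the population in the text is normal, whereas here a hypothetical population of six values is used, with samples of size three and an arbitrary discrepancy criterion. The share of all possible samples that are discrepant is compared with the probability of drawing a discrepant sample when every combination is equally likely.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite population and sample size.
population = [2, 3, 5, 7, 11, 13]
n = 3


def is_discrepant(sample, threshold=25):
    """Arbitrary illustrative criterion: the sample total reaches the threshold."""
    return sum(sample) >= threshold


# The listing of all possible samples, each n-fold combination listed once.
all_samples = list(combinations(population, n))

# Share of all possible samples that are discrepant.
share_discrepant = Fraction(
    sum(1 for s in all_samples if is_discrepant(s)), len(all_samples)
)

# Probability of a discrepant sample under simple random sampling, where
# each listed combination is drawn with equal probability.
equal_chance = Fraction(1, len(all_samples))
srs_probability = sum(equal_chance for s in all_samples if is_discrepant(s))
```

The two quantities coincide because simple random sampling is defined over the same listing that defines the set of all possible samples.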

The logic for significance tests without random samples, as set out above,
may fail of course if we have knowledge that the sample $X_{\mathrm{obs}}$ was not drawn
with a simple random sampling mechanism, i.e., if we are aware of some bias
in the sampling process. For example, if we know that the sample $X_{\mathrm{obs}}$ was
drawn from a peculiarly distributed (with respect to z) subpopulation of $\Pi$ (say
z(X) is biased upward) then the direct inference that $P(X_{\mathrm{obs}}) = .05$ ignores
relevant information (the correct inference is that $P(X_{\mathrm{obs}})$ lies in the interval
above .05 and less than or equal to 1, although without more information we
cannot be more precise). However, it is not necessary that we know that $X_{\mathrm{obs}}$
was drawn strictly at random. All that is necessary logically is that we not have
knowledge to the contrary. Given this ‘postulate of ignorance’ (Fisher’s term), we
know only that $X_{\mathrm{obs}}$ belongs to a set of samples which exhibit the property that
$|z| \geq 1.96$ with relative frequency 5%. Hence, for us, conditioning objectively
on that knowledge available, the probability $P(X_{\mathrm{obs}})$ is .05. Indeed, to say that P
takes any value other than .05, or (more popularly) that P is indeterminate,
disregards what relevant knowledge we have; that is the knowledge that
samples like (exchangeable with) $X_{\mathrm{obs}}$ exhibit z-scores greater than or equal to
1.96 with a relative frequency of exactly 5%.

The logic for tests of significance without random sampling is a logic of 
ignorance, being applicable properly only if our ignorance of the stochastic 
properties of the sampling process is such that the nominal or observed level of 
significance, $P(X_{\mathrm{obs}})$, cannot rationally be revised. Following Fisher, $P(X_{\mathrm{obs}})$ is
interpreted as the probability of a specific event, e.g., the event that $|z| \geq 1.96$,
occurring in a particular single test. The validity of this interpretation depends
explicitly on the class of tests to which that particular test is deemed to belong,
and hence on what is known of that particular test, or (inversely) what is not
known. Moreover, $P(X_{\mathrm{obs}})$ is a logical probability in the single case only if the
reference class is free from any recognizable relevant (peculiarly distributed)
subset. This is emphasized frequently in Fisher's writings. For example: 


This fundamental requirement for the application to individual cases of the 
concept of classical probability shows clearly the role both of well specified 
ignorance and of specific knowledge in a typical probability statement. . . . The 
knowledge required for such a statement refers to a well-defined aggregate, or 
population of possibilities within which the limiting frequency ratio must be
exactly known. The necessary ignorance is specified by our inability to 
discriminate any of the different sub-aggregates having different frequency 
ratios, such as must always exist. (Fisher [1956] pp. 35-6) 


In a strict epistemological framework, where results are based objectively on a 
body of knowledge already established, ignorance can be an advantage. 
Specifically, it is easier to measure precisely the level of significance P of a 
sample observed if nothing whatsoever is known of the sampling procedure 
than if there is information suggesting some sort of bias. If there is evidence of 
bias, the P-value observed must be revised; e.g., a nominal .05 may be adjusted
to an interval value such as [.05, 1], but if there is no basis upon which to suspect
bias, then the P-value observed may properly be retained. This is a very 
important result given the prevalence in science, and other fields of inquiry, of 
non-randomized data, and data of unknown statistical pedigree. It should not 
be considered imperative that data be gathered under the protection of a 
mechanical randomizer; cf., Seidenfeld [1979, pp. 221-2]. Significance tests, 
and other conventional statistical procedures, may be applied legitimately to
historical and quasi-experimental data given sufficient ignorance of the subsets
from which that data was drawn or to which it belongs. However, this 
ignorance must be bona fide. It remains a cardinal methodological sin to 
suppress or ignore relevant information, either actively or by leaving 
particular stones unturned; cf., Good [1974, p. 124]. 


5 AN ANALOGY 


Consider the problem of Peterson the Swede. (Ayer [1972, p. 51] cites the 
source of this problem.) Peterson is a Swede, and 95% of Swedes are 
Protestants. Given only this knowledge, what is the probability that the 
individual Peterson is a Protestant? Intuitively, if all we know is (i) that 
Peterson is a Swede (we do not know that his name is Peterson or anything 
about him), and (ii) that 95% of Swedes are Protestant, then the probability 
(degree of certainty) that the given individual (‘Peterson’) is a Protestant is .95.
This answer is derived by the same logic as that proposed for significance tests 
without the requirement of strict random sampling. To see this analogy, the 
Peterson logic is set out in parallel with the logic for tests without random 
sampling: 


Tests                                      Peterson

A: 5% of all possible samples              A: 95% of all Swedes are
   from $\Pi$ have $|z| \geq 1.96$.           Protestant.
B: X is a sample from $\Pi$.               B: Peterson is a Swede.
$\Rightarrow$ C: The probability that X    $\Rightarrow$ C: The probability that
   has $|z| \geq 1.96$ is .05.                Peterson is Protestant is .95.


Referring now to the Peterson problem, the usual argument for random 
sampling (e.g., Kendall and Stuart [1958, p. 226]) is as follows. The Swede 
Peterson may possibly (we don't know) have been generated with a biased 
sampling mechanism; e.g., a mechanism which has the stochastic property of 
generating almost entirely Catholics. In this case, Peterson belongs to a 
relevant subset of Swedes, the set drawn with that biased mechanism, in which 
case the probability that Peterson is a Protestant equals not .95 but close to
zero. Hence, unless we know that Peterson was drawn with a strict simple
random sampling mechanism, it cannot be said that the probability that
Peterson is Protestant is .95, because we cannot ignore the possibility that
Peterson belongs to a relevant subset of Swedes generated with a biased (or
weighted) mechanism. Moreover, a premise of random sampling makes all the
difference, or at least it seems very generally agreed, even by the most
orthodox (frequentist) statisticians, that if Peterson had been selected at 
random, then it could be said (on the information available) that the 


450 j D. J. Johnstone 


probability that Peterson is a Protestant is 95.5 (How a strict frequentist would 
interpret this statement is not clear.) It does not seem to matter that Peterson 
may possibly (on the information available) belong to the set of Swedes who 
visit Lourdes, or to any other relevant subset, of which there must be many. 
Should it matter, therefore, when the sampling mechanism is unknown, that 
Peterson may possibly belong to a subset of Swedes defined (generated) by a 
biased sampling mechanism? With or without random sampling, there are 
relevant subsets to which Peterson may possibly belong. Moreover, if it is 
necessary to account for every relevant subset to which Peterson may 
conceivably belong, then logically all direct inference⁶ is improper. That is, it is
illogical to equate probability in the single case with any relative frequency,
because there are always further relevant subsets which we have not sufficient
knowledge to recognize (Fisher [1956, p. 36; 1959, p. 23]). 

If in the case of ignorance the possibility of relevant subsets cannot be 
ignored, we can always add assumptions. If necessary, we can assume that 
every subset of Swedes to which Peterson belongs is stochastically irrelevant;
which implies, inter alia, an assumption of simple random sampling. But this
would be ludicrous and demonstrably incorrect. Hence, if the mechanism by
which Peterson was selected is unknown, there appears a simple choice: either
we may conclude that the probability that Peterson is a Protestant is .95,
employing logically and objectively what little information we have, or 
alternatively we can say that the probability that Peterson is a Protestant is 
logically indeterminate. This choice arises every day for insurance companies 
which have to decide, on the basis of frequency data, whether or not to insure 
individuals who, far from being drawn at random, often present themselves. 
Insurance companies, bookmakers, and others who rely on frequency data do 
not demand random samples. Similarly, if a gambler or other rational being 
were to guess either way on Peterson being a Protestant, knowing only that he 
is a Swede, she would guess of course that Peterson is indeed a Protestant, her 
reasoning being that the subjective probability, conditioned on that information
presently available, that Peterson is a Protestant is very high, in fact .95.
This would appear a valid intuition. Moreover, if logic is merely good intuition
or intuitive reasoning formalized, then the inference that the probability that
Peterson is a Protestant equals .95 is not merely intuitive, but logical.


5 Seidenfeld [1978, p. 710] notes that there is quite general agreement over direct inference under
conditions of specified ignorance and random sampling.

6 By definition, ‘direct inference’ equates logical or credal probability (Carnap's probability₁) with
relative frequency (probability₂). For example, ‘If we know that flips of coin a (by process P)
constitute a kind of trial on which the chances are 0.4 that the coin lands heads-up, then (by
Direct Inference) the credal probability is 0.4 that the 6th flip of a (by process P) lands heads-up’
(Seidenfeld [1982], p. 269).


6 RANDOM SAMPLING AND R. A. FISHER 


The principle that experiments (data) must be randomized is due, like much 
else in statistics, to R. A. Fisher. Yet notwithstanding his usual advocacy of 
randomization, the logic for tests without random samples appears explicitly in 
Fisher’s own writing. Specifically, Fisher condones direct inference without 
any requirement of random sampling. Given the observed frequency that 51% 
of all babies are boys, and no other information, Fisher [1958, pp. 263-4] 
infers that the probability (degree of certainty) that a particular child, b say 
(e.g., the next baby born), will be a boy is .51. His requirement for this result is
not that b is selected at random, but merely that the reference set to which b is
deemed to belong is free from any recognizable relevant subset:


On inquiry at the registrar, we may find that in his experience, or in the 
experience of much larger numbers recorded by registrars in different parts of the 
world, a fixed proportion of the births has been of boys and the remainder of girls. 
Let us suppose he tells us that in 51 per cent the births are those of boys (a little 
more than 51 per cent in most populations). To the registrar, the birth which is
about to take place, though intensely important to ourselves, is just another 
birth. To him it belongs to this set of his experience of sex at birth, and he very 
properly informs us that the probability of a boy is 51 per cent, having made 
reference to this measurable reference set as the basis of his statement... . 

Secondly, we satisfy ourselves as to the existence of relevant sub-sets. I need 
not use the word ‘random’ because all I need say can be said under '(c).' [‘(c)’ 
being the requirement that ‘No relevant sub-set can be recognized.'] . . . (Fisher 
[1958] p. 264) 
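
The role of a recognizable relevant subset in direct inference can be made concrete with a small computational sketch. The following Python fragment is an illustration only, not anything given by Fisher or in this paper: the population counts, the attribute names and the function `direct_inference` are all hypothetical, chosen so that 95 per cent of Swedes are Protestants while almost no Swedish visitors to Lourdes are. It equates the probability in the single case with the relative frequency in the narrowest reference set that the available information picks out.

```python
from fractions import Fraction


def make_population():
    """Build a hypothetical population of 1000 Swedes (invented counts)."""
    people = []
    # 980 Swedes who do not visit Lourdes: 949 Protestants, 31 others.
    people += [{"swede": True, "lourdes": False, "protestant": True}] * 949
    people += [{"swede": True, "lourdes": False, "protestant": False}] * 31
    # 20 Swedes who visit Lourdes: 1 Protestant, 19 others.
    people += [{"swede": True, "lourdes": True, "protestant": True}] * 1
    people += [{"swede": True, "lourdes": True, "protestant": False}] * 19
    return people


def direct_inference(population, known, target):
    """Relative frequency of `target` within the reference set fixed by `known`.

    `known` maps attribute names to the values we recognize the individual
    as having; anything not listed is treated as unknown and is ignored.
    """
    reference_set = [
        person for person in population
        if all(person[name] == value for name, value in known.items())
    ]
    if not reference_set:
        raise ValueError("no members in the reference set")
    hits = sum(1 for person in reference_set if person[target])
    return Fraction(hits, len(reference_set))


if __name__ == "__main__":
    population = make_population()
    # Knowing only that Peterson is a Swede.
    print(direct_inference(population, {"swede": True}, "protestant"))
    # Recognizing, in addition, that he visits Lourdes.
    print(direct_inference(population, {"swede": True, "lourdes": True}, "protestant"))
```

On these invented figures, knowing only that Peterson is a Swede gives 19/20, whereas recognizing him as a visitor to Lourdes gives 1/20. Nothing in the calculation refers to how Peterson was selected; what changes the answer is which subsets are recognizable on the information in hand.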


7 RANDOM SAMPLING AND BAYES' POSTULATE 


The argument that random sampling is not necessary given sufficient 
ignorance of the sampling mechanism may seem to presume or imply 'Bayes' 
postulate' (the assumption under ignorance of equal probabilities). This same 
argument, based explicitly on Bayes' postulate, has appeared previously in the 
literature, having been put forward by the British statisticians Kendall and 
Stuart as a possible subjectivist response to the conventional requirement of 
random sampling: 


If this matter is viewed from the standpoint of the subjective theory of probability 
the absence of knowledge about relationship between the method of selection 
and the characteristic under consideration may be sufficient to ensure random- 
ness, for the probabilities of elementary propositions then become equal—the 
probabilities being measures of prior attitudes of mind. But if the frequency 
viewpoint is adopted, it is not enough that there should be absence of knowledge
of this kind, for unknown to the observer there may be relations which will
prevent the elementary propositions from being true in approximately equal 
proportions. (Kendall and Stuart [1958] p. 226) 




It is important not to confuse the argument put forward in this paper with that 
suggested by Kendall and Stuart. These are not the same, and neither reduces 
to the other. The difference is that the Kendall and Stuart logic includes a 
premise of simple random sampling, this premise being based upon Bayes’ 
postulate (the assumption of equal probabilities under ignorance), whereas the 
argument supported in this paper requires no such premise, however founded. 
Either way, the probability that Peterson is a Protestant is -95. But for Kendall 
and Stuart, this result requires the assumption of simple random sampling, 
whereas for Fisher there is no such assumption required or implied.


8 RANDOM SAMPLING REMAINS USEFUL 


The philosopher Rudolf Carnap [1950, pp. 202-5] distinguished between logic 
and methodology. In Carnap's terms, logic is rules or syllogism whereas
methodology is theory or principles guiding the application of those rules. For 
example, Bayes' theorem is logic, whereas the principle that a rational agent 
should never invoke a prior probability of exactly zero or one (he should allow 
his posterior probability distribution to be influenced by the data) is not logic 
proper, but merely a basic principle for the application of a logic, and thus 
methodology. 

In Carnap's terms, Fisher's principle of randomization is not logical but 
methodological, for although tests of significance may go through logically 
without randomization, actual physical randomization is often useful. The 
function of randomization is to ensure as far as possible that the sample drawn 
does not belong to any relevant subset, whether recognizable or otherwise; cf., 
Finney [1964, pp. 327-8]. There are two ways by which a sample may fall into 
a relevant subset. These are: 


(i) bias built into the sampling procedure; e.g., a government survey of jobless 
conducted by telephone. 

(ii) chance; e.g., a government survey of jobless which happens unintendedly 
('by chance') to include mostly women, or mostly the better off.


Random sampling precludes (i), by definition, and makes (ii) unlikely, the more
so the larger the sample (the larger the sample, the more likely that the effects
of uncontrolled covariates will even out). Presumably, it is the preclusion of (i)
which had Fisher claim that randomization guarantees the validity of 
experiments: 


... it may be said that the simple precaution of randomisation will suffice to 
guarantee the validity of the test of significance, by which the result of the 
experiment is to be judged. (Fisher [1935] p. 21) 


The theory of estimation presupposes a process of random sampling. All our 
conclusions within that theory rest on this basis; without it our tests of 
significance would be worthless. In order to justify the conclusions of the theory
of estimation, and the tests of significance as applied to counts or measurements
arising in the real world, it is logically necessary that they too must be the results
of a random process. In controlled experimentation it has been found not difficult
to introduce explicit and objective randomization in such a way that the tests of 
significance are demonstrably correct. (Fisher [1947] pp. 435-6) 


This claim is clearly an overstatement, for there must always remain a chance 
(however reduced) of (ii), even in blocked or stratified designs, especially if 
there are numerous covariates not controlled. Taking an example from the 
field of statistical sampling in financial auditing, a strictly random sample of
accounts receivable might, by chance, include only recent or active accounts, 
these having a lower default rate than the population as a whole. In this case, 
the probability P(X_obs) of observing a default d(X) greater than or equal to
d(X_obs) is lower than the P-value observed, although not precisely determinate.
Thus, despite experimental randomization, the probability P(X_obs) may not
withstand an ‘after-trial’ analysis (Good [1974, p. 123], Kyburg [1974, p.
368], Seidenfeld [1979, p. 202; 1982, p. 271]). This was first explained in the
literature by Gosset, Fisher's longtime associate: 


If, however, the arrangement is ‘randomized’ one can—before the draw—state 
accurately, subject to normality, etc., what the chance of getting any particular 
partition of variance between ‘treatment’ and ‘residual error’ will be in the ‘null’ 
case. After the draw, when one particular arrangement has been chosen, it is 
often possible to be sure that the chance has changed in one direction or another 
without, however, being able to define exactly what it is... This is analogous to 
the use of a life table to give the expectation of life. Thus the expectation of life of 
an Englishman of 40 can be referred to an appropriate table, but when we 
particularize the Englishman of 40 as a tin-miner or an agricultural labourer we 
know that the expectation is lower or higher than that given in the table without 
perhaps knowing very exactly by how much. (‘Student’ [1937] p. 367) 
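
The claim that randomization makes chance imbalance unlikely, but never impossible, can be checked by direct calculation. The sketch below is an illustration under assumed figures, not part of the auditing example itself: it supposes a hypothetical population of 1000 accounts of which 500 are recent, and computes the exact (hypergeometric) probability that a simple random sample drawn without replacement has a share of recent accounts differing from the population share by more than a chosen tolerance. The function name `prob_chance_imbalance` and all the numbers are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def prob_chance_imbalance(pop_size, pop_recent, sample_size, tolerance):
    """Exact probability that a simple random sample is 'unbalanced'.

    A sample counts as unbalanced when its share of recent accounts differs
    from the population share by more than `tolerance`. Sampling is without
    replacement, so the count of recent accounts is hypergeometric.
    """
    pop_share = Fraction(pop_recent, pop_size)
    all_samples = comb(pop_size, sample_size)
    unbalanced_samples = 0
    for k in range(sample_size + 1):  # k = recent accounts in the sample
        if abs(Fraction(k, sample_size) - pop_share) > tolerance:
            unbalanced_samples += (
                comb(pop_recent, k) * comb(pop_size - pop_recent, sample_size - k)
            )
    return Fraction(unbalanced_samples, all_samples)


if __name__ == "__main__":
    # Hypothetical population: 1000 accounts, 500 of them recent.
    for n in (10, 50, 200):
        p = prob_chance_imbalance(1000, 500, n, Fraction(1, 5))
        print(f"sample size {n}: P(imbalance) = {float(p):.3g}")
```

On these assumptions the probability of imbalance falls as the sample grows but stays strictly positive at every sample size, which is the sense in which randomization reduces (ii) without precluding it.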


9 PRACTICAL IMPLICATIONS 


If random sampling is strictly necessary, scientists who employ orthodox
statistical methods must ensure physically that the samples on which 
inferences are based really are random. Paradoxically, if the argument that 
random sampling is not strictly necessary is accepted, the scientist's methodo- 
logical obligations may remain much the same, for he must ensure that there is 
not attainable (on all current knowledge) any sufficient evidence of bias. In 
ordinary statistical practice, it may rarely be the case that a scientist and his 
fellows, after looking at the data, have no knowledge of the stochastic 
mechanism by which the data appeared. There may often be indications that 
the sample observed belongs to some relevant (biased) subset: 


Only in Philosophy does one know of Peterson only that he is a Swede. (Kadane 
[1988])




To ensure that there is not (and cannot be) any sufficient evidence of bias in the 
sampling mechanism, it may be necessary to put into effect actual physical 
randomization, as is required conventionally. Much depends, however, on the 
strength of evidence required to uphold any claim or suspicion of bias. Note 
that any sufficient indication of bias vitiates not only a Fisherian ‘postulate of 
ignorance’, but also the conventional assumption of random sampling. 

Perhaps the most fundamental implication of the argument against the 
necessity for random sampling is that the scientist, or statistician, if required to 
defend his sampling procedure, as may be the case in a court of law, has only to 
plead that no (sufficient) evidence of bias could be found, notwithstanding his 
best reasonable efforts. He has not to demonstrate that an assumption of 
random sampling can be sustained, for no such assumption is required. The 
burden of proof is shifted. Rather than the scientist having to prove the validity 
of a crucial (yet often unwarranted) assumption, a litigant must demonstrate 
sufficient evidence of bias. 

Lindley [1986] maintains that concern with logical and philosophical 
foundations may be justified in so far as these affect practice. The argument,
herein, that random sampling is not necessary logically, may have no great 
effect on practical standards. However, it may affect the basis on which those 
standards are explained, justified, or defended in court. The novelty of the 
argument is not the standards implied, but the logic on which those (existing) 
standards are based. If orthodox textbooks can be taken as representative of 
current theory, there is no logical justification in conventional statistical 
thought for anything but strict random sampling. 


10 CONCLUSION


There is rarely if ever a truly random sample in any field of scientific inquiry 
(Cochran [1976, p. 7], Kempthorne [1975, pp. 318-19; 1979, p. 136]). 
Nonetheless, the application of conventional statistical methods, especially 
tests of significance, is widespread. Most applied scientists employ conven- 
tional statistical methods unconcernedly provided the sample is not obviously 
or apparently biased. Logically, random samples are nice, but certainly not 
essential. All that is necessary logically is ignorance; ignorance of any 
information suggestive of bias.⁷ Ignorance, however, is not bliss. There is
always the danger that tests dependent upon ignorance of the sampling 
mechanism will be vitiated by new or further information. But this is so of all 
tests, for all tests are dependent to some extent on ignorance. If the truth be 
known, even the most 'random looking' samples or arrangements belong (like 


7 This is a far more realistic criterion than the requirement of random sampling. Keynes [1921,
pp. 290-1] explained that it cannot ever be known that a sample is random in the classical
sense, because the stochastic properties (propensities) of the sampling mechanism are manifest
only in ‘the long run’ (by when ‘we are all dead’).




Peterson) to innumerable relevant subsets, but not recognizable (on current 
knowledge) subsets; that is why they ‘look random’. Further information, 
however, may uncover further relevant subsets. Randomized or not, the 
validity of experiments depends always on the current state of knowledge.⁸


Graduate School of Management and Public Policy 
University of Sydney 


8 Kyburg [1968, pp. 158-9] explains that the validity of experiments can be insured (given the 
necessary sample size or degrees of freedom) against certain information by tempering 
experimental randomization with appropriate experimental control (e.g., stratification or 
blocking); cf., Seidenfeld [1979, pp. 214-18]. 


REFERENCES 


AYER, A. J. [1972]: Probability and Evidence. Columbia University Press.

BASU, D. [1980]: ‘Randomization Analysis of Experimental Data: The Fisher Randomization
Test’ (with discussion), Journal of the American Statistical Association, 75,
pp. 575-95.

BERGER, J. O. and WOLPERT, R. L. [1984]: The Likelihood Principle. Institute of
Mathematical Statistics.

BIRNBAUM, A. [1962]: ‘On the Foundations of Statistical Inference’ (with discussion),
Journal of the American Statistical Association, 57, pp. 269-326.

BUEHLER, R. J. [1959]: ‘Some Validity Criteria for Statistical Inferences’, The Annals of
Mathematical Statistics, 30, pp. 845-63.

CARNAP, R. [1950]: Logical Foundations of Probability. 2nd ed. 1962. University of
Chicago Press.

COCHRAN, W. G. [1976]: ‘Early Development of Techniques in Comparative Experimentation’
in D. B. Owen (ed.) On the History of Statistics and Probability, pp. 1-25.
Marcel Dekker.

COX, D. R. [1958]: ‘Some Problems Connected with Statistical Inference’, The Annals of
Mathematical Statistics, 29, pp. 357-72.

COX, D. R. and HINKLEY, D. V. [1974]: Theoretical Statistics. Chapman and Hall.

FINNEY, D. J. [1964]: ‘Sir Ronald Fisher's Contributions to Biometric Statistics’,
Biometrics, 20, pp. 322-9.

FISHER, R. A. [1935]: The Design of Experiments. 8th ed. 1966. Oliver and Boyd.

FISHER, R. A. [1936]: ‘“The Coefficient of Racial Likeness” and the Future of
Craniometry’, Journal of the Royal Anthropological Institute, 66, pp. 57-63.

FISHER, R. A. [1947]: ‘Development of the Theory of Experimental Design’, Proceedings 
of the International Statistical Conferences (Washington) 3, pp. 434-9. 

FISHER, R. A. [1955]: ‘Statistical Methods and Scientific Induction’, Journal of the Royal 
Statistical Society, B, 17, pp. 69-78.

FISHER, R. A. [1956]: Statistical Methods and Scientific Inference. 3rd ed. 1973. Hafner.

FISHER, R. A. [1958]: ‘The Nature of Probability’, Centennial Review, 2, pp. 261-74.

FISHER, R. A. [1959]: ‘Mathematical Probability in the Natural Sciences’, Technometrics,
1, pp. 21-9.

FISHER, R. A. [1960]: ‘Scientific Thought and the Refinement of Human Reasoning’,
Journal of the Operations Research Society of Japan, 3, pp. 1-10.


FISHER, R. A. [1962]: ‘The Place of the Design of Experiments in the Logic of Scientific 
Inference’, in J. H. Bennett (ed.) Collected Papers of R. A. Fisher Vol. 5. 1974. 
Pp. 528-32. Reprinted from Colloques Internationaux du Centre National de la 
Recherche Scientifique (Paris), 110, pp. 13-9. 

FOLKS, J. L. [1984]: ‘Use of Randomization in Experimental Research’ in K. Hinkelmann
(ed.), Experimental Design, Statistical Models, and Genetic Statistics: Essays in Honor of
Oscar Kempthorne. Pp. 17-32. Marcel Dekker.

GOOD, I. J. [1974]: ‘Random Thoughts About Randomness’, in K. F. Schaffner and R. S.
Cohen (eds.), PSA 1972: Proceedings of the 1972 Biennial Meeting Philosophy of
Science Association, pp. 117-35. D. Reidel.

JOHNSTONE, D. J. [1987a]: ‘Tests of Significance Following R. A. Fisher’, The British
Journal for the Philosophy of Science, 38, pp. 481-99. 

JOHNSTONE, D. J. [1987b]: ‘On the Interpretation of Hypothesis Tests Following Neyman 
and Pearson’, in R. Viertl (ed.), Probability and Bayesian Statistics, pp. 267-77. 
Plenum. 

JOHNSTONE, D. J. [1988]: ‘Hypothesis Tests and Confidence Intervals in the Single Case’, 
The British Journal for the Philosophy of Science, 39, pp. 353-60. 

KADANE, J. B. [1988]: Private Communication. 

KADANE, J. B. and SEIDENFELD, T. [1987]: ‘Randomization in a Bayesian Perspective’, 
Working Paper: Carnegie-Mellon University.

KEMPTHORNE, O. [1975]: ‘Inference from Experiments and Randomization’ in J. N. 
Srivastava (ed.), A Survey of Statistical Design and Linear Models, pp. 303-31. North- 
Holland. 

KEMPTHORNE, O. [1979]: ‘Sampling Inference, Experimental Inference and Observation 
Inference’, Sankhya, 40, B, pp. 115-45. 

KENDALL, M. G. and STUART, A. [1958]: The Advanced Theory of Statistics. Vol. 1. 
Distribution Theory. 4th ed. 1977. Charles Griffin. 

KEYNES, J. M. [1921]: A Treatise on Probability. Macmillan Press for the Royal Economic
Society.

KYBURG, H. E. [1968]: Philosophy of Science: A Formal Approach. Macmillan.

KYBURG, H. E. [1974]: The Logical Foundations of Statistical Inference. D. Reidel.

KYBURG, H. E. [1976]: ‘Chance’, Journal of Philosophical Logic, 5, pp. 355-93.

LEEDS, S. [1981]: ‘Discussion: Kyburg and Fiducial Inference’, Philosophy of Science, 48,
pp. 78-91. 

LEHMANN, E. L. [1986]: Testing Statistical Hypotheses 2nd ed. Wiley. 

LINDLEY, D. V. [1972]: Bayesian Statistics: A Review. Society for Industrial and Applied
Mathematics.

LINDLEY, D. V. [1983]: ‘The Role of Randomization in Inference’, PSA 1982: Proceedings
of the 1982 Biennial Meeting of the Philosophy of Science Association. Vol. 2. Pp. 431-46.
Philosophy of Science Association.

LINDLEY, D. V. [1984]: ‘A Bayesian Lady Tasting Tea’ (with discussion), in H. A. David
and H. T. David (eds.), Statistics: An Appraisal, pp. 455-85. Iowa State University
Press.

LINDLEY, D. V. [1986]: Plenary Address at the International Symposium on Probability
and Bayesian Statistics, Innsbruck 1986.

LINDLEY, D. V. and NOVICK, M. R. [1981]: ‘The Role of Exchangeability in Inference’, The
Annals of Statistics, 9, pp. 45-58.


NEYMAN, J. [1937]: ‘Outline of a Theory of Statistical Estimation Based on the Classical 
Theory of Probability’, A Selection of Early Statistical Papers of J. Neyman. Cambridge 
University Press. 1967. Pp. 250-90. Reprinted from Philosophical Transactions of 
the Royal Society of London, Series A, 767, 236, pp. 333-80.

PEARSON, E. S. [1947]: ‘The Choice of Statistical Tests Illustrated on the Interpretation of 
Data Classed in a 2 x 2 Table’, Biometrika, 34, pp. 139-67. 

ROSENKRANTZ, R. D. [1977]: Inference, Method and Decision: Towards a Bayesian 
Philosophy of Science. D. Reidel. 

SAVAGE, L. J. [1962]: Discussion in L. J. Savage et al., The Foundations of Statistical 
Inference: A Discussion. Methuen. 

SAVAGE, L. J. [1976]: ‘On Rereading R. A. Fisher’ (with discussion), The Annals of 
Statistics, 4, pp. 441-500. 

SEIDENFELD, T. [1978]: ‘Direct Inference and Inverse Inference’ (with discussion), The
Journal of Philosophy, 75, pp. 709-37. 

SEIDENFELD, T. [1979]: Philosophical Problems of Statistical Inference: Learning from R. A. 
Fisher. D. Reidel. 

SEIDENFELD, T. [1982]: ‘Levi on the Dogma of Randomization in Experiments’ in R. J. 
Bogdan (ed.) Henry E. Kyburg Jr. and Isaac Levi, pp. 263-91. D. Reidel. 

‘STUDENT’ [1937]: ‘Comparison Between Balanced and Random Arrangements of Field 
Plots’, Biometrika, 29, pp. 363-79. 




Brit. J. Phil. Sci. 40 (1989), 459-483 Printed in Great Britain 


Theory Structure and Theory Change in 
Contemporary Molecular Biology 


SYLVIA CULP AND PHILIP KITCHER¹


ABSTRACT 


Traditional approaches to theory structure and theory change in science do not
fare well when confronted with the practice of certain fields of science. We offer an
account of contemporary practice in molecular biology designed to address two
questions: Is theory change in this area of science gradual or saltatory? What is the
relation between molecular biology and the fields of traditional biology? Our main
focus is a recent episode in molecular biology, the discovery of enzymatic RNA. We 
argue that our reconstruction of this episode shows that traditional approaches to 
theory structure and theory change need considerable refinement if they are to be 
defended as generally applicable. 


1 Introduction 

2 Practice in Contemporary Molecular Biology 

3 The Discovery of Enzymatic RNA 

4 Understanding the Change 

5 Derivative Revolutions and Reductionism Revisited 








1 INTRODUCTION


Despite a plethora of critiques of what used to be known, quite aptly, as the 
‘received view’ of scientific theories, contemporary debates about intertheore- 
tic relations and about theory change are still often conducted as if scientific 
theories could be unproblematically identified as deductively closed sets of 
sentences. In our judgment, the attempt to find some small number of general 
laws that can serve as the axioms of the scientific theory under study has 
proved especially baneful in the biological sciences. Even where axiomatiza- 
tion has been achieved—as in the careful studies of Mary Williams on the 
theory of natural selection [1970, 1973]—there are serious questions as to 
whether the axiomatization captures what is central to the theory as biologists 


1 This paper emerged from discussions between us, and we are both equally responsible for its
errors. We would like to thank Yvonne Paterson for helpful comments. 




actually understand it, use it, and develop it. Indeed, we believe that the continued 
smouldering problems about the ‘triviality’ or ‘unfalsifiability’ of neo- 
Darwinism are, in large measure, the product of thinking that there has to be 
some principle (or principles) of great generality that formulate the core of the
theory of evolution by natural selection (see Kitcher [1982], pp. 55-60). 
Similarly, in the debates about the reduction of classical genetics to molecular 
biology, the macro-level theory has been essentially divorced from what 
geneticists have been doing since approximately 1915 (let alone 1953) by 
proposing that classical genetics can be identified with the logical conse- 
quences of Mendel's laws (or minor adjustments of those laws). If any scientific 
theory worthy of the name is a deductively closed set of sentences whose 
axioms include generalizations that range over all the entities in the domain of 
the theory, then it seems that either biology has no theories or what theories it 
has are both trivial and irrelevant to the thinking and activity of almost all 
contemporary biologists. 

Nor are things much better if we abandon the syntactic conception of
theories in favor of its chief rival, the so-called ‘semantic view of theories’,
according to which a theory is a class of models. Although this approach is 
valuable in differentiating a theory from its formulations (Suppe [1972]), in 
clarifying the relation between theory and observation (Van Fraassen [1980]),
and in reconstructing parts of important biological theories (Lloyd [1984],
Lloyd [forthcoming]), we believe that it encounters problems that are parallel 
to those besetting the older conception when it seeks to treat theories at a high 
level of generality. The task of specifying the class of models that is Darwinian 
evolutionary theory or the class of models that is classical genetics gives us no 
more adequate a picture of these biological disciplines than does the task of
specifying the axioms of these theories—and for the obvious reason that the 
new-style specifications of sets of models look remarkably like the old-style 
axioms (see Beatty [1980] and Giere [1979] for examples). Proponents of the
semantic view have sometimes gone to considerable efforts to reconstruct the 
big biological theories and to re-examine the questions of inter-theoretic 
relations and theory change (see Balzer and Dawe [1986] for a laudably 
thorough attempt). However, like the endeavours on behalf of the ‘received
view’, their struggles seem to result in reconstructions that are quite remote
from what has been going on in biology for the past fifty years. It is only a little
unfair to conclude that some philosophical reconstructions of major biological
theories reconstruct only the opening chapters of introductory texts.² We
shall offer a different approach to biology, keyed to a specific, illustrative 


2 Ironically, while the general question of how to approach major biological theories remains
confused, more specific studies of parts of theoretical biology are thriving as never before. 
Witness the important work of David Hull, Elliott Sober, John Beatty, Robert Brandon, John 
Dupré, William Wimsatt, Elisabeth Lloyd, Alexander Rosenberg, Richard Burian, and a number
of other authors. 




example. We do not believe that the case we describe is atypical of biology— 
indeed, although we shall not defend a sweeping thesis here, we think that it 
represents the structure of many areas of science. Our starting point is the 
notion that, at any particular stage in the development of a science, there is a 
complex state of that science, the aggregate of all that the scientists working in 
the field at the time are thinking, saying, writing, and doing. Philosophers,
intent on answering certain questions about the synchronic relations among 
sciences or the diachronic development of particular disciplines, try to 
reconstruct this state. Prevailing fashions in the philosophy of science suppose 
that we can distil from all the sayings and doings a theory that dominates the 
science of the time. We shall identify a multi-dimensional entity that we call a 
‘scientific practice’. Those who believe that a scientific theory could not be 
anything except a set of highly general laws and their logical consequences (or, 
if you like, a class of models) may conclude that we have not offered an account 
of the structure of biological theory. We will happily abandon the word, for 
what seems to us to be vastly more significant is the delineation of the ways in
which the practice of biological science achieves generality. 

Philosophical reconstruction should not be dominated by the preconception 
that there is an essence of metascientific concepts that awaits our exposure. 
Our phenomena are the sayings, writings, and doings of a group of scientists.³
There is probably no all-purpose reconstruction of the phenomena that will
capture the essence of the science and provide an idiom in which to formulate
all possible epistemological and methodological questions. Instead, we should
start with particular questions and look for an account of the phenomena that
enables us to focus those questions. In the present case, the background
philosophical questions are ‘What are the relations between biological theories 
and theories in physics and chemistry?’ and ‘Are changes in biological theories 
gradual or saltatory?’ (Behind these formulations stand, of course, cloudier 
issues about reductionism and about revolutions in science.) We shall try to 
reconstruct the phenomena in a way that makes it possible to address these 
questions. 

While our approach breaks with many traditional ideas about scientific
theories in general and biological theories in particular, it is not bereft of 
ancestors. In his discussions of the character of normal science, Thomas Kuhn 
offered a multi-dimensional view of the state of a science at a time ([1970] 
Chapters 3—5), and our approach can be seen as articulating further some of 
the elements that Kuhn discerned as shared among a group of contemporaneous


3 For the purposes of this essay, we do not venture into the tricky issue of how the group is to be
demarcated. We also idealize the phenomena by concentrating only on those parts of the
practice that are acknowledged by all members of the group. In fact, one of us (P.S.K.) believes
that it is important to take account of the cognitive diversity within a slice of science if one wants
to address some fundamental problems about progress and rationality in science. But we are
concerned to advance one heresy at a time.


scientists. Moreover, there are affinities between our approach and
ideas of Sylvain Bromberger [1963], Dudley Shapere [1974], and Kenneth 
Schaffner [forthcoming]. The present essay also articulates an approach that 
has been taken with respect to other biological examples and associated 
philosophical problems: the reduction of classical genetics to molecular biology 
(Kitcher [1984]), the resolution of the Darwinian revolution (Kitcher 
[1985a]), the relation between various ventures in sociobiology and contem- 
porary neo-Darwinism (Kitcher [1985b] chapters 1-4), and theory-change in 
cardiology (Lie [unpublished]). 


2 PRACTICE IN CONTEMPORARY MOLECULAR BIOLOGY 


Philosophers approaching molecular biology for the first time ought to be 
struck by the apparent amorphousness of the subject. General introductions to 
the field (Watson et al [1987], Darnell et al [1986]) are both dauntingly long 
and full of apparently disparate kinds of information. There is typically some 
attention to organic and physical chemistry, and metabolic and physical 
biochemistry. This is followed by a review of the main phenomena of 
prokaryotic and eukaryotic cell biology and genetics, and (less systematically) 
of development and physiology. Considerable space is given to the presentation 
of techniques for deriving information about the molecules present within a 
cell. Finally, the bulk of the work consists in bringing all the previous elements 
together: there is a whole mass of claims about the molecular interactions in 
particular processes in particular kinds of organisms, specifically about the 
processes that culminate in the formation of the biologically significant 
molecules, the nucleic acids and proteins. These claims are buttressed through 
the description of the experiments that have led to their adoption. Over- 
whelmed by all the detail, the philosopher searches in vain for the display of 
core laws, big generalizations analogous to Maxwell's equations, Schrödinger's
equation, or Newton's laws of dynamics. What makes all this one
subject? 

Matters only become worse as one probes the technical literature. More and 
more detail is added to the study of gene replication and expression. The pages 
of Science, Nature, and Cell contain reports of discoveries that practitioners 
herald as being of theoretical importance—but they seem extraordinarily 
particular. Here a new RNA polymerase, there a suggestion of a conserved 
DNA sequence. Nourished on the philosophical staples about scientific


* Kuhn's notion of paradigm (or, later, disciplinary matrix) unfortunately served double duty, 
expressing both the idea that there is a complex of shared elements and the view that the history 
of a science can be divided into discrete units, punctuated by revolutions (paradigm shifts). As 
will become clear below, the former idea is quite independent of the latter—indeed, we believe 
that development of the first makes the acceptance of the second quite implausible. For further 
discussion of the relation between our notion of practice and Kuhn's conception of paradigm, 
see chapter 7 of Kitcher [1983].




theories, the bewildered reader fails to understand how any old generalization 
has been refuted or any new generalization proposed, and, in consequence, 
fails to see what makes the work ‘theoretically significant’. Molecular biology 
continues to look like a vast morass of unconnected detail. Where is the
theory? 

Similar questions could easily be generated if our imaginary philosopher 
had started with classical genetics—the field that concerned itself with the 
transmission of hereditary traits—had chosen any point in the history of the 
subject between 1915 and 1953, and had resolutely read beyond the 
simplifying Mendelian formulations of the introductory chapter. For some 
philosophical purposes, classical genetics can be reconstructed by introducing 
the concept of a practice consisting of a language, a set of statements, a set of 
questions, a set of patterns of reasoning, a set of methodological directives, and 
a set of experimental techniques (see Kitcher [1984]). The language is that 
used by the community of geneticists, the statements are those that they 
accept, the questions are those that they take to be important to address, the 
patterns of reasoning are the forms of argument that they employ in answering 
those questions, the methodological directives are the standards by means of 
which they appraise contending solutions and evaluate proposed experiments, 
and the experimental techniques are the methods they employ in interrogating 
nature. We shall try to show how a similar approach can bring order to the 
apparent amorphousness of contemporary molecular biology. 

Let us start with the central problems that molecular biology sets for itself. 
Its grand project can be seen as that of tracing the chemical reactions that 
occur in biologically significant processes, where, for the molecular biologist, 
biologically significant processes are, paradigmatically, processes of nucleic 
acid replication, transcription, translation, and their control. The big vague 
question is ‘How does P occur?’ where P is to be replaced by some description of
a biological process. In contemporary work, the big vague question is made far 
more precise through the incorporation of particular views about what 
processes are biologically significant and about the chemistry of living 
organisms. 

The traditional disciplines of nonmolecular biology contribute to the 
specification. Start with genetics. Classical genetics recognizes the existence of 
information-bearing units, genes, that are copied, transmitted to gametes and 
combine in zygotes (in asexual organisms, the route into the next generation is 
typically more direct). These units, occasionally singly, typically in combina- 
tion, are associated with phenotypic traits of the organism, but, from the 
perspective of classical genetics, this association is given (indeed, genes are first
identified through their phenotypic effects, or, more exactly, through differen- 
tial phenotypic effects of their mutant alleles, so that the association of gene 
and trait is there from the start). To pry into the process through which the 
genes in concert with one another and with the environment give rise to the 




phenotype of the organism is a task for developmental biology. So, from 
genetics proper, we take the important biological processes of gene replication 
and gametic segregation: the big vague question becomes ‘How do genes copy 
themselves?’, ‘How do gene copies get distributed to gametes?’. 

Developmental biology takes up the story where genetics leaves off. We can 
divide the enterprise into two unequal parts. The lesser task consists in 
accounting for zygote formation, understanding the fusion of sperm and egg, 
the incorporation of the DNA of the sperm within the nucleus of the egg. Far 
more onerous is the project of tracing the link between genotype and adult 
phenotype, showing how adult morphology is generated from the (relatively 
unstructured) zygote. With respect to particular organisms and particular 
types of structures—limb buds or nervous systems, for example—ontogenesis 
can be segmented into stages, and the developmental question can be further 
focused by asking how the structures present at one stage are transformed into 
those present at the next. Against the background presupposition that copies 
of all the genes present in the zygote are found in (almost) all the cells of the 
developing organism, the big vague question can be made more precise. ‘When 
do cells become differentiated?’, ‘What constitutes the differentiation of cells of 
a particular type?’, ‘What signals are involved in differentiation?’, ‘How do 
different genes become active in different cells?'. All of these questions are 
taken up in the context of molecular biology as demands for the specification of 
molecules and of molecular interactions. Differentiation is ultimately to be 
understood in terms of the molecular composition of cells, and this molecular 
composition is, in its turn, to be traced to the systems of gene control and the 
molecules that interact with them.

Besides its contributions to our understanding of processes important to the 
transmission of genes (spindle formation) and to the link between gene and 
phenotype (the functioning of the cytoskeleton), cell biology also contributes 
questions that have nothing to do with genetics or the link between genes and 
phenotypes. Cells have to engage in certain kinds of reactions so that the 
organism can function physiologically. Here we begin from the physiology of 
the organism and from some physiological phenomenon (respiration, digestion,
or muscular contraction, say), proceed to a categorization of the cell-level
processes that are involved (for example, the taking up of oxygen in the
lungs and the delivery of oxygen to the blood), and ask how the cells in the
process are acting and what enables them to do the things they do. As with
the phenomena of ontogenesis, the molecular questions come into focus once 
we have achieved a decomposition of the process, identifying it as a complex of 
subprocesses involving individual cells. At that stage, we can begin to inquire 
about the chemical compositions of the cells and about the molecular (or 
physical) interactions among them. 

We are now able to introduce some structure into molecular biology by 




seeing that the big vague question breaks down into more precise questions 
along the lines depicted in Figures 1 and 2. Our use of two illustrations here 
suggests a contrast between molecular biology as currently practiced and 
molecular biology as it might ultimately develop. On the former interpretation, 
the biologically significant processes are simply the current paradigms— 
nucleic acid replication, gene transcription, protein synthesis, and so forth. 
The second view is broader, bringing within the domain of molecular biology 
the processes of intermediate metabolism by decomposing them into com- 
plexes whose elements are the kinds of processes that are currently studied by 
molecular biologists. Across all these families of questions there cuts a 
common set of presuppositions about the chemical reactions that occur in
living organisms. The presuppositions begin with claims about the kinds of 
molecules found in living organisms, and about the chemistry of these 
molecules (organic chemistry). They proceed to claim that typical reactions 
found in living things are ‘uphill reactions’, reactions that need to be catalyzed 
if they are to occur. Hence, for typical reactions in organisms, it will be 
necessary to specify a catalyst, an enzyme. Finally, in the molecular biology 
that held sway up to five years ago, there was a further presupposition: 
enzymes are proteins. 

Understanding these presuppositions, we can proceed to specify the normal 
form of a problem solution in molecular biology. We start with some 
biologically significant process P and with our big vague question ‘How does P 
occur?’. The normal form of an answer is as follows: 


[1] $P$ is composed of $n$ elementary processes $P_1, \ldots, P_n$.

[2] For $P_1$ the reactants are $r_{11}, \ldots, r_{1k_1}$, the enzyme is $e_1$; for $P_2$ the reactants
are $r_{21}, \ldots, r_{2k_2}$, the enzyme is $e_2$; ... (continue through all the $P_i$).

[3] For $P_1$ the products are $R_{11}, \ldots, R_{1p_1}$; for $P_2$ the products are $R_{21}, \ldots, R_{2p_2}$; ...
(continue through all the $P_i$).

In accepting this as a complete answer to the question, we make the
following demands: (a) for each $P_i$, the specification of the products must
be obtained from the specification of the reactants for that $P_i$ by applying
standard principles of chemical kinetics; (b) the reactants for $P_{i+1}$ must be
included among the products of $P_i$; (c) there must be appended to [1]-[3] a
specification of the initial and final states of the organism showing that the
reactants $r_{1j}$ are initially present and that the final state consists in the
presence of the $R_{nj}$ products.5
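
For readers who find a formal rendering helpful, the normal form and demands (b) and (c) can be sketched as a small data structure. What follows is only an illustrative sketch in Python. The class name, the function names, and the placeholder labels for reactants, enzymes, and products are assumptions introduced here for illustration; they are not drawn from the biological literature. Demand (a), which appeals to the principles of chemical kinetics, is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementaryProcess:
    """One elementary process P_i: its reactants, its enzyme, and its products."""
    name: str
    reactants: frozenset
    enzyme: str
    products: frozenset


def chain_gaps(processes):
    """Demand (b): report reactants of P_{i+1} that are not among the products of P_i."""
    gaps = []
    for prev, nxt in zip(processes, processes[1:]):
        missing = nxt.reactants - prev.products
        if missing:
            gaps.append((nxt.name, sorted(missing)))
    return gaps


def satisfies_boundary(processes, initial_state, final_state):
    """Demand (c): P_1's reactants are present initially, P_n's products at the end."""
    return (processes[0].reactants <= initial_state
            and processes[-1].products <= final_state)


# Hypothetical two-step decomposition; every label is a placeholder.
steps = [
    ElementaryProcess("P1", frozenset({"r11", "r12"}), "e1", frozenset({"R11"})),
    ElementaryProcess("P2", frozenset({"R11"}), "e2", frozenset({"R21"})),
]

print(chain_gaps(steps))  # prints []
print(satisfies_boundary(steps, frozenset({"r11", "r12"}), frozenset({"R21"})))  # prints True
```

On this rendering, a proposed answer counts as incomplete when `chain_gaps` returns a non-empty list or `satisfies_boundary` returns `False`. Nothing in the structure requires that an enzyme be a protein.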


Let us illustrate this abstract account with an example. In the study of the 
relation between genes and phenotypes, an obviously important process is that 


5 The normal form of solution offered here is a simplified version of the underlying explanatory 
schema. For a general account of explanatory schemata, see Kitcher [1981], and for articulated 
examples in particular cases, including the case of classical genetics, see Kitcher [forthcoming]. 


[Figure 1: the simple hierarchy of questions. The root question, ‘How does P occur?’, branches into genetics (‘How are genes transmitted?’) and development (‘How do zygotes become mature organisms?’), and descends to questions such as ‘How is DNA copied?’, ‘How does transcription work?’ and ‘How does translation work?’.]

[Figure 2: the expanded hierarchy. Genetics and development appear as in Figure 1; further branches cover muscle formation (‘How is the production of actin and myosin controlled in muscle cells?’) and cell biology (‘How do organisms move?’, ‘How do muscles contract?’, ‘How do actin and myosin contribute to the contraction of a cell?’).]




in which particular genes give rise to proteins: any aspect of development 
obviously involves many—typically very many—episodes in which genes are 
‘read’ and proteins synthesized. A first-level decomposition of any developmen- 
tal process will thus posit numerous processes in which genes ‘encode’ 
proteins. However, each one of these processes is itself regarded as complex: 
the gene is selected for transcription; the gene is transcribed to form a 
messenger RNA; that mRNA is post-transcriptionally modified; the modified 
mRNA is translated, ultimately giving rise to a protein product at the 
ribosomes. From this second-level decomposition, we select the process of 
transcription for more detailed study. How does transcription occur? 

Here are the major features of the story as we currently understand it. There 
are protein molecules, RNA polymerases, that can enter into loose associations 
with double-stranded DNA. Effectively, these molecules can slide up and down 
the DNA, ‘scanning’ it. If the DNA contains the appropriate kind of sequence, a
promoter, and if the RNA polymerase is in the right configuration, then the 
RNA polymerase becomes bound to the DNA at an appropriate point and the 
DNA becomes unwound. RNA chain synthesis is initiated and elongates along 
one of the exposed strands. 

Now a full molecular biological account would be able to eliminate some of 
the vague and metaphorical terminology that we have employed in the 
previous paragraph. That account would be able to specify in molecular detail 
the four phases that are discerned in our story: loose association, binding, 
unwinding, and chain-synthesis. The task is to identify the chemical structure
of ‘loose association’, to specify the ‘appropriate sites’ and to show how, at 
these sites, the configuration of RNA polymerases is changed, thence to specify 
the altered configuration of the DNA and show how it results, and, finally, to 
give the molecular details of chain-elongation along the exposed strand. Parts 
of the puzzle are in place for particular gene complexes in particular 
organisms—but the past decade has witnessed a number of discoveries of 
significant differences among different cases. In terms of our normal form, we 
have a working hypothesis about the decomposition of the process of 
transcription and some details of the subprocesses for particular cases of 
transcription. It is still too early to tell how much generality we may eventually 
attain. 

We hope that it is now possible to see how the apparently amorphous 
literature in molecular biology is organized. The mass of details in the books, 
monographs, and research articles, that comprises the set of accepted 
statements in the current practice of molecular biology, is structured by the 
hierarchy of questions that the discipline addresses, the hierarchy displayed in 
Figures 1 and 2. Each of the statements obtains its significance from the part it
plays in answering one of the questions in the hierarchy, and, we believe, that 
role can best be understood by seeing how the statement contributes to an 
answer in normal form (the pattern delineated above, p. 465).
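
The organizing role of the hierarchy can be illustrated with a toy tree of questions. The sketch below is in Python and rests on assumptions: the nesting shown is a small illustrative fragment rather than a transcription of the figures, the question ‘How do genes give rise to proteins?’ is a paraphrase introduced here, and the names `HIERARCHY` and `path_to` are hypothetical. The function recovers the chain of questions that gives a low-level question its significance.

```python
# A fragment of the question hierarchy as nested dictionaries.
# Each key is a question; its value holds the more precise sub-questions.
HIERARCHY = {
    "How does P occur?": {
        "How do genes copy themselves?": {},
        "How do gene copies get distributed to gametes?": {},
        "How do genes give rise to proteins?": {
            "How does transcription occur?": {},
        },
    },
}


def path_to(tree, target, trail=()):
    """Return the list of questions leading from the root to target, or None."""
    for question, children in tree.items():
        here = trail + (question,)
        if question == target:
            return list(here)
        found = path_to(children, target, here)
        if found:
            return found
    return None


for question in path_to(HIERARCHY, "How does transcription occur?"):
    print(question)
```

A statement about, say, promoter binding would then be located by the path from the big vague question down to the transcription question that it helps to answer.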




Two further aspects of the practice of molecular biology deserve comment 
before we offer a more extended illustrative example. First, the language of 
contemporary molecular biology is a curious hybrid. For many of the 
expressions employed by molecular biologists, the reference is fixed, parasiti- 
cally, in the fashion of organic chemistry: terms like ‘bond energy’ and ‘peptide 
chain’ retain their standard chemical significance in the biological context. 
Where terms are taken over from biology or are employed in ways specific to 
the biological material, there may, however, be a re-fixing of reference. We 
shall see this in more detail in the next Section, when we consider the 
modification of the concept of enzyme. Second, molecular biology uses
metaphorical language in describing important subprocesses: polymerases 
‘scan’ the DNA, introns are ‘excised’, and so forth. The assumption seems to be 
that, ultimately, it will be possible to say in a literal, biochemical language, 
what is now expressed in the helpful metaphors. As we shall suggest below,
there are important questions that lurk here. 

Finally, no account of molecular biology should overlook the importance of 
the experimental techniques. Philosophical views of science (with a few 
exceptions—e.g. Hacking [1983]) frequently distort the character of the 
enterprise by stressing ‘grand theory’. Since contemporary molecular biology 
is founded on methods for arriving at the structure of complex molecules— 
nucleic acids and proteins—without which there would be no chance of 
providing the structural descriptions needed for the normal form of solution to 
molecular biological problems, it is impossible simply to regard this component 
of the practice as an afterthought. Those methods are, of course, intertwined 
with the large presuppositions that, in our view, help to structure the field. If it 
were discovered that small fragments of nucleic acids that convey genetic 
information were tightly associated in some cells with genomic DNA, then the 
techniques for mapping and sequencing genomes that are based on the 
presupposition that all genetic information is linearly arranged on nucleic acid 
molecules would have to be radically revised. In the next section, we shall 
consider a real instance that resembles this hypothetical discovery, and 
attempt to show both the significance of experimental technique and how 
changes in large presuppositions about what experiments show can be 
accommodated. 


3 THE DISCOVERY OF ENZYMATIC RNA 


We have already noted that, for biological reactions to occur under 
physiological conditions, they require an enzyme to act as a catalyst. Thus, 
when any instance of the big vague question has been decomposed and we are 
addressing the normal form of some component question, it is always 
significant to inquire after the enzyme involved. 




Attention to the history of the study of enzymes helps us understand how the 
issue is focused, and what presuppositions are involved in addressing it. In the 
absence of catalysts, biological reactions would only occur very slowly. 
Enzymes accelerate those reactions from 10^6 to 10?? times without themselves
being consumed. Enzymes are also highly specific; often an enzyme will 
catalyze only a single reaction, or, at most, a set of closely related reactions. 
Since Sumner's first isolation of an enzyme, urease, in 1926, hundreds of 
enzymes have been purified and shown to catalyze specific biological reactions,
and the total range of reactions catalyzed is now extensive. Like urease, all the 
enzymes that had been isolated until very recently were proteins, so that it was 
natural for biologists to assume that all enzymes are proteins, to approach the 
problem of finding the enzyme by looking for a protein, and even to define 
enzymes as proteins (Westheimer [1986]).6

In 1986, Zaug and Cech reported in Science that a segment of RNA could
act as an RNA synthesizing enzyme. Within two weeks, their discovery was 
hailed as 'revolutionary' by Westheimer in the 'News and Views' section of 
Nature (Zaug and Cech [1986], Westheimer [1986]). Because all enzymes 
were assumed to be proteins, it is surprising that Zaug and Cech designed 
experiments to look for an enzyme activity mediated by an RNA molecule. 
However, Cech and his colleagues were first led to challenge the presupposi- 
tion that all enzymes are proteins by observations that they published five 
years earlier (Cech et al [1981]). Cech and his co-workers had been studying
ribosomal RNA (rRNA) splicing in the ciliated protozoan Tetrahymena
thermophila. In the course of their study they designed experiments for
identifying and isolating the protein enzymes that catalyzed the rRNA 
splicing—a part of an instance of a normal form problem. As the pattern 
displayed above (p. 465) reveals, one subtask is to identify the enzymes 
involved in the constituent subprocesses of the biologically significant process 
under investigation, and it was precisely this subtask that Zaug and Cech 
addressed. After purifying the rRNA precursor (pre-rRNA) from T. thermophila
and mixing it with nuclear extracts from the same cells, they were able to 
produce spliced rRNA in a test tube. However, a negative control, in which the 
nuclear extract was omitted, also gave rise to spliced rRNA. Assuming that 
more stringent purifying methods were needed, they tried several methods to 
remove the enzymatic activity by destroying proteins that they supposed to be 
associated with the pre-rRNA. None of their procedures were able to destroy 
the RNA splicing activity. Thus they were led to conclude that the pre-rRNA
might be an enzyme. 

However, Cech and his colleagues [1981] buried this embarrassing fact in 


6 This assumption was less arbitrary than it may sound. It was supported by the recognition that
proteins, built up out of twenty amino acid side chains, would have enough diversity to give the 
required enzyme specificity. 




their paper among a large body of data on the splicing mechanism. In both the 
results and discussion sections, they only mentioned briefly the possibility that
some part of the pre-rRNA might be an RNA enzyme. The model of pre-rRNA 
splicing that they presented at the end of the discussion emphasized the 
structures of the molecules before and after splicing, and did not mention the 
possibility that an RNA enzyme might catalyze the reaction. Their paper was 
published in Cell, a prestigious and widely-read journal. Yet only readers who
read this paper carefully realized that Cech and his colleagues might have 
found an RNA enzyme. 

Why was Cech so modest in burying a discovery that would be hailed, five
years later, as revolutionary? The answer is relatively obvious. First, since
protein enzymes had been found for every other biological reaction studied, 
there was no reason to expect that an RNA molecule could be an enzyme 
(although, we hasten to note, there is nothing in the underlying biochemistry 
to preclude the possibility that nucleic acids can catalyze reactions). Second, 
because the pre-rRNA had been isolated from T. thermophila cells, it would still
be possible to deny the enzymatic RNA interpretation by claiming that a 
protein enzyme had become incorporated with the pre-rRNA and had not been 
inactivated by the various methods tried by Cech and his colleagues. 

If recombinant DNA technology had not been developed four years earlier, 
Cech's data would probably have remained buried or, if it had later been 
presented more forthrightly, would have engendered an unresolved contro- 
versy among molecular biologists. However, Cech's next paper, published 
again in Cell in 1982, was entitled 'Self-splicing RNA' (Kruger et al [1982]). 
Cech and his co-workers used recombinant DNA technology to clone a portion 
of the T. thermophila rRNA gene into an E. coli (bacterial) plasmid vector. After
purifying the recombinant plasmid DNA from the E. coli cells, they used
purified E. coli RNA polymerase to make the pre-rRNA in vitro. This pre-rRNA,
separated from all T. thermophila proteins, could be judged with confidence to
be nucleic acid and nucleic acid alone.7 By showing that it was able to splice
itself, Cech and his colleagues demonstrated that a protein enzyme was not
necessary for pre-rRNA splicing. However, because the pre-rRNA modified itself
during the splicing reaction, it did not meet all the conditions on enzymes. 
Kruger et al [1982] decided to dub it a ‘ribozyme’. 

Between 1982 and 1986, a series of papers on the mechanism and products 
of the ribozyme-mediated RNA-splicing reaction appeared from Cech’s 
laboratory (Zaug et al [1983], Bass and Cech [1984], Zaug et al [1984], Inoue 
et al [1985], Sullivan and Cech [1985], Zaug et al [1985]). Using the 
information amassed in this period, Cech and his colleagues were eventually
able to show that a piece of the pre-rRNA, removed and modified during the 


7 The pre-rRNA could have been associated with E. coli proteins, but these were assumed to be
irrelevant because E. coli is a prokaryote and T. thermophila a eukaryote.




self-splicing reaction, could act as an enzyme. By 1986, they were prepared to 
conclude that this piece of RNA could act as an RNA polymerase, putting 
together long strands of RNA (Zaug and Cech [1986]). The segment in
question was shown to meet the three conditions for an enzyme: it could 
mediate reactions that could not proceed in its absence, it was reaction- 
specific, and it was not modified by these reactions. 
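
The three conditions can be read as a simple checklist. The sketch below, in Python, is an illustration under stated assumptions. The names `Candidate`, `unmet_conditions`, and `is_enzyme` are hypothetical. For the self-splicing pre-rRNA of 1982 the text records only that it modified itself, so the values assigned to its other two conditions are assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate catalyst, described by the three conditions on enzymes."""
    name: str
    mediates_reaction: bool      # the reaction does not proceed in its absence
    reaction_specific: bool      # it catalyzes one reaction or a closely related set
    unmodified_afterwards: bool  # it is not modified by the reactions it mediates


def unmet_conditions(candidate):
    """List the conditions on enzymes that the candidate fails to meet."""
    checks = {
        "mediates reaction": candidate.mediates_reaction,
        "reaction-specific": candidate.reaction_specific,
        "unmodified afterwards": candidate.unmodified_afterwards,
    }
    return [label for label, met in checks.items() if not met]


def is_enzyme(candidate):
    """A candidate counts as an enzyme when no condition is unmet."""
    return not unmet_conditions(candidate)


# The first two values for the 1982 case are assumed for illustration only.
pre_rrna_1982 = Candidate("self-splicing pre-rRNA", True, True, False)
segment_1986 = Candidate("excised RNA segment", True, True, True)

print(unmet_conditions(pre_rrna_1982))  # prints ['unmodified afterwards']
print(is_enzyme(segment_1986))          # prints True
```

Note that none of the three checks mentions chemical composition: on these criteria alone, whether the candidate is a protein or a nucleic acid is left open.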

Discovering that enzymes can also be made of RNA may seem a small 
change in enzyme theory, merely an addition of RNA enzymes to the long list 
of protein enzymes. Alternatively, one might defend the ‘revolutionary’ claim 
of Westheimer's review, by maintaining that the concept of enzyme has been 
refashioned and that a central generalization of molecular biology (the thesis
that all enzymes are proteins) has been abandoned. In our judgment, neither
of these responses to the incident captures what has been going on. In the 
following Section we shall try to show how developments so far achieved 
might appear in retrospect either as minor modifications of molecular biology 
or as revolutionary. 

We conclude our discussion of the case by noting two ways in which it has 
reshaped current research. Past experiments (directed at the normal form 
question of finding the enzyme) had been designed to look for proteins. Where 
such experiments had been inconclusive, it is possible that re-investigation will 
disclose the activity of enzymatic nucleic acids; it is perhaps even possible that
techniques of purification that have been jettisoned as problematic may be 
resurrected. In particular, the research on ribosome function may need to be 
reevaluated (Moore [1988]). In every biological system, ribosomes are used to 
translate protein from mRNA. Ribosomes contain both RNA and protein. Until 
now, the enzymatic activities associated with ribosomes had been assigned to 
the ribosomal proteins. Cech’s work raises the possibility that ribosomal RNAs 
have enzymatic activities. In addition, other cellular processes, such as 
regulation of RNA transcription and processing, may need to be reevaluated 
because they could also be mediated by RNA enzymes.

A second area in which research has been affected is the field of prebiotic 
evolution. Only three weeks after the publication of Zaug and Cech [1986], 
Gilbert proposed in the ‘News and Views’ section of Nature that, at the earliest 
stages of evolution, the world was an RNA world (Gilbert [1986]). Since then, 
renewed interest in prebiotic evolution has led to the publication of several 
speculative papers (Cech [1986], Joyce et al [1987], Weiner and Maizels 
[1987]) and to the organization of a major meeting on the evolution of 
catalytic function, the annual Cold Spring Harbor Symposium on Quantitative 
Biology. The renewal of interest has obviously been sparked by appreciation of 
the fact that an RNA-synthesizing enzyme composed of RNA makes it possible 
that a molecule could catalyze synthesis of itself. The hypothesis that the same 
molecule acts as a synthetase and serves as the template overcomes the 
chicken-and-egg problem posed by the conventional theoretical requirement 




that two separate (but interdependent) molecules are needed for replication:
synthetases made of protein and templates made of RNA.


4 UNDERSTANDING THE CHANGE 


In a sense the action is already over. After years of research, Cech and his co- 
workers found a segment of RNA capable of performing enzymatic functions. 
The discovery was accepted by his colleagues, and, indeed, canonized through 
the incorporation of a section on enzymatic RNA in the latest edition of the 
most celebrated textbook in the field (Watson et al [1987]).* While it is possible 
that the discovery of enzymatic RNA will live up to Westheimer’s billing of it as 
‘revolutionary’, it seems to us equally possible that the molecular biology of the 
twenty-first century will treat Cech’s discovery as an isolated curiosity, 
relegating it to footnotes of the genre: ‘Readers should be aware that, although 
almost all enzymes are proteins, there are exceptional cases in which nucleic 
acids can perform certain elementary enzymatic functions. The first such case 
discovered emerged from the work of Cech and others on rRNA splicing in 
Tetrahymena thermophila. . . .’ Our aim in this section is to analyze the case of
the discovery of enzymatic RNA from the perspective outlined in Section 2. We 
shall use our reconstruction both to show how that discovery might lead to a 
revolution in molecular biology, and to draw some general morals about
scientific change. 

Consider first what has occurred so far. The practice of contemporary 
molecular biology has been affected in the following ways. (1) The referent of 
‘enzyme’ can no longer be fixed by the description ‘A protein that mediates a 
specific biological reaction (or a set of closely related biological reactions) and 
that is unmodified at the end of the reaction(s)’. (2) The statement ‘All enzymes 
are proteins’ is no longer accepted; the statement ‘A segment of T. thermophila 
pre-rRNA can catalyze RNA polymerization’ is accepted. (3) Normal form 
solutions to normal form questions no longer presuppose that the identifica- 
tion of enzymes at step 2 will identify proteins. (4) It is no longer necessary to 
contend that an experimental technique for purifying nuclear extracts by 
eliminating protein is suspect if its application permits enzymatic activity. At 
least four components of the practice—language, statements, schematic 
answers, and experimental techniques—have all felt the impact of Cech's 
discovery. Perhaps we could also add a fifth, by noting the addition of a new 


8 Even if RNA enzymes are not widespread in the present world, that does not mean that they were
not initially prominent. Perhaps enzymatic RNA was relatively primitive, and, once the system
got advanced enough to make proteins, those organisms (Dawkinsian replicators?) that stayed
with enzymatic RNA were at a disadvantage except with respect to very special processes like
rRNA splicing.


* As this paper was in press Cech and his colleagues were awarded the Nobel prize in Chemistry for 
their work on enzymatic RNA. 


474 Sylvia Culp and Philip Kitcher 


theoretical question: What kinds of reactions can we expect nucleic acids to 
catalyze? 

How were these changes accomplished? Let’s start with the state of 
molecular biology in 1981. Cech and his co-workers attack a problem that the 
molecular biology of that time counts as significant, the identification of the 
enzyme catalyzing pre-rRNA splicing in T. thermophila (note that the 
significance of the protozoan in this case stems simply from the fact that this 
was an organism in which the process could be studied using the technology 
available at the time). Given the assumption that all enzymes are proteins, they 
expect to find a splicing enzyme. The negative control experiment apparently 
fails, leaving them with the options of concluding that the splicing is self- 
mediated or that the purifying techniques they have employed have not 
removed all the relevant protein. Appealing to the standard experimental 
techniques of molecular biology, they make further efforts to eliminate the 
unwanted protein. 

At this stage, resolution of the puzzle could still be obtained by suggesting 
that the standard techniques are incapable of dealing with certain highly 
recalcitrant proteins that associate tightly with pre-rRNAs. That possibility is 
eliminated by appealing to the central techniques of recombinant DNA: it is 
taken for granted that application of these techniques will yield pure pre-rRNA.⁹ 
(Alternatively, questioning the applicability of the methods used in cloning the 
T. thermophila gene and synthesizing its product in vitro would entail an even 
more massive shift in molecular biology than that provoked by the claim of 
enzymatic RNA. At the very least, would-be cloners would have had to question 
the assumption that E. coli proteins are incapable of splicing eukaryotic RNA, 
an assumption fundamental to the developing techniques of cloning and 
sequencing.) Cech and his colleagues conclude that RNA can sometimes 
function like an enzyme: RNAs can mediate reactions on themselves that are 
typically performed by splicing enzymes (proteins) but parts of the RNAs are, of 
course, changed in the process. Notice that this conclusion, and even the more 
ambitious conclusion that RNAs can act as enzymes, can be accepted while 
preserving the underlying biochemistry, for there is nothing in theoretical 
biochemistry that declares that enzymatic (or quasi-enzymatic) functioning of 
nucleic acids is impossible. 

The last phase of the course of experimentation consists in lengthy tinkering 
with the biological system so as to achieve a clean example, not something that 
is ‘very close’ to an enzyme but something that meets all the conditions.¹⁰ 


9 Here ‘pure’ means ‘free from T. thermophila proteins’. 

10 It is not clear to us, given the sequence of experiments that Cech and his colleagues actually 
performed, whether the demonstration that RNA could fulfil all conditions on enzymes was an 
explicit goal of the research. Much of the intervening work is concerned with elucidating 
details of the ribozyme mechanism, and the decision to conduct the experiment in which they 
showed pure enzymatic RNA may have been taken relatively late. 


Theory Structure/Change in Contemporary Molecular Biology 475 


Here, what Zaug and Cech accomplish is the separation of the active RNA from 
its substrate and product. Against the background of accepted experimental 
technique in molecular biology, and the biochemical underpinning of that 
technique, their experiment leaves no option but to scrap the presupposition 
that only proteins can be enzymes. The practice of molecular biology is then 
brought into line by modifying the concept of enzyme—changing the 
description that is standardly used to fix reference—by explicitly allowing a 
new option in the normal form of problem solution, and by reconsidering the 
merits of experimental techniques. 

All this might be revolutionary. Now that enzymatic RNA has once been 
identified, we may find it turning up everywhere. Indeed, another enzymatic 
RNA, RNase P, was identified by Sidney Altman’s group (Guerrier-Takada et al. 
[1983]) after Cech’s discovery of self-splicing by the Tetrahymena pre-rRNA. 
What this means is that the everyday experimentation designed to uncover 
and articulate normal form solutions takes advantage of the fact that a 
constraint formerly imposed on the schema underlying those solutions has 
been removed. Correlatively, the experiments now performed have to be set up 
to control not only for the possible enzymatic effects of proteins, but also for 
potential effects of nucleic acids. In general the change that has already 
occurred in the practice of molecular biology modifies the schema underlying 
normal form solutions—the schema displayed in Section 2—by replacing a 
prior constraint (the constraint that the enzymes identified at step 2 be 
proteins) with a more liberal condition (the constraint that the enzymes be 
either proteins or nucleic acids). A recent review article by Moore [1988] 
shows that the practice of molecular biology has already absorbed this change. 
What remains indeterminate is the extent to which instantiation of the new 
schema will be different from instances that would have fitted the old. We 
suggest that the change will only live up to Westheimer’s advertisement if 
there is now a significant number of problem solutions that instantiate the 
modified schema that would not have exemplified the original. This, of course, 
is just a way of capturing the commonsensical point that the discovery of 
enzymatic RNA only proves revolutionary if cases in which nucleic acids act as 
enzymes occur with some reasonable frequency. 

But how significant is ‘significant’, how frequent constitutes a ‘reasonable 
frequency’? Better, at this point, to drop the hard-and-fast categorization of 
‘revolution’, and recognize that our judgments about changes in science ought 
to allow for something like a continuum of cases, that differences in degree are 
sometimes large enough to prompt us to talk of a difference in kind, and that 

11 Again, we note explicitly that the frequency with which nucleic acids act as enzymes may vary 
at different periods of the history of life. In particular, enzymatic RNA may have been prevalent 
in the primeval soup but proteins may have taken over the role of enzymes for almost all 
reactions over the next three billion years. Were that to be so, the discovery of enzymatic RNA 
would cause profound changes for origins-of-life research while only altering molecular 
investigations of reactions in contemporary organisms in very minor ways. 




talk of ‘revolutions’ is the product of the prompting. The magnitude of the 
change in molecular biology stemming from the discovery of enzymatic RNA 
will be directly proportional to the frequency with which molecular biologists 
now generate normal form solutions that instantiate the new schema but that 
would not have fitted the old. Where, if anywhere, one wants to say that the 
frequency has become big enough so that the discovery is genuinely 
revolutionary seems to us a matter of little importance. It is enough that our 
way of reconstructing the episode allows us to distinguish scenarios involving 
changes great and small, and that it permits us to see how the magnitude of the 
change is not yet determined. 

We can now venture some general morals about scientific change. The first 
point to notice is that, whether the transformation of molecular biology turns 
out to be relatively large or relatively small, there is no indication that any 
problematic methodological principles will be needed to resolve disputes. As 
we have reconstructed the evidence and the reasoning from it, the past 
practice of molecular biology was a background against which rather simple 
arguments led to the modification of some components in the light of new 
evidence. With the new practice in place, similar arguments are likely to be 
used to construct particular problem-solutions from specific experiments, and 
there is no reason to believe that judgments about the frequency of cases in 
which nucleic acids act as enzymes will be any more controversial than the 
initial judgment made by Zaug and Cech. As we look at the micro-structure of 
a scientific change—whether it turns out to be major or minor—we see how 
large effects can occur from the succession of epistemologically unproblematic 
modifications of practice. Indeed, we believe that it is illuminating to focus on a 
case in which the eventual magnitude of the transformation is presently 
unknown, for it brings home to us the point that the same kinds of reasons are 
operative both in the small and in the large. As Laudan has argued [1984], we 
tend to make the big shifts in science appear incomprehensible by juxtaposing 
the endpoints and omitting the intervening steps. Our study reinforces the 
argument, by showing that the primary shift may leave open the extent of the 
total change, so that the reasons behind it are reasons that can be involved in 
the minutiae of ‘normal scientific development’ or in the excitement of 
‘revolutionary change’. 

Our second moral is that an adequate account of the state of a science at a 
time must have some way of assessing the ways in which particular concepts, 
statements or assumptions are involved in the activity of scientific problem- 
solving. The ‘received view’ of theories sees that a statement, ‘All enzymes are 
proteins', has been replaced with another statement, 'Many enzymes are 
proteins, but at least a few enzymes are nucleic acids’, but, from the perspective 
of the 'received view', there's no way of evaluating the involvement of these 
statements in the work of molecular biologists. (Note that neither statement 
seems to be a promising candidate for a law; thus, from the perspective of the 




‘received view’, one accidental generalization has given way to another.) By 
emphasizing the structure of normal form problems and normal form problem 
solutions, we have tried to show how statements function in the scientist's 
vision of the world and so to express how the modification of a statement— 
even of a statement that is not a law—can profoundly affect that vision. 

Here we have deliberately used Kuhnian language, because we think that 
there is something importantly right about Kuhn’s much-derided suggestion 
that ‘revolutions are changes in world-view’. However, Kuhn’s way of 
developing the point seems to us to go in the wrong direction. Instead of 
thinking that the change of view is linked to conceptual shifts, associated with 
conceptual incommensurability, we take up ideas from Kuhn’s analysis of 
normal science, and suppose that the change of view is constituted by the use of 
different schemata in the solution of scientific problems. If Cech’s work does 
have the far-reaching implications prophesied by Westheimer, then molecular 
biologists will see many phenomena differently, and the change in view will be 
revealed in their design and pursuit of experiments. But, we suggest, there will 
be no Gestalt switches, no Kuhnian conceptual incommensurability, no shift 
in methodological standards. For understanding the change, it is necessary to 
take very seriously the idea of problem-solving that Kuhn so insightfully placed 
at the center of normal science. 


5 DERIVATIVE REVOLUTIONS AND REDUCTIONISM REVISITED 


We want to conclude with a brief look at the impact of changes in one field on 
other areas of science, and with some even briefer comments on the timeworn 
topic of reductionism. As we noted at the end of Section 3, the research of Cech 
and his colleagues has sparked new interest in the topic of pre-biotic evolution. 
This is a topic that might be assigned either to evolutionary biology or to 
molecular biology, if the latter were conceived a bit more broadly than in our 
treatment by enriching the set of questions to include functional as well as 
structural/mechanistic questions. For present purposes, we will adopt what 
we will take to be a purely conventional delineation of molecular biology by 
opting for the narrower characterization of its problems that we offered above, 
and, in consequence, we will suppose that issues of present function (‘What 
selection pressure, if any, maintains this molecular structure?') or of original 
function (‘Under what selection pressure, if any, did this molecular structure 
evolve?’) are the province of a subfield of evolutionary biology. Our first task is 
to show how Cech's discovery might cause a derivative revolution in this 
subfield. 

The current practice of evolutionary theory contains a number of state- 
ments that jointly generate a major theoretical problem. Organisms evolved 
from primitive self-replicating systems. If these self-replicating systems were 
similar to contemporary primitive organisms, then they contained both 




nucleic acids and proteins. Moreover, both nucleic acids and proteins are essential 
to the self-replication. Hence the problem ‘How could self-replicating systems 
have appeared in a primitive soup?’ becomes ‘How could both proteins and 
nucleic acids have appeared in a primitive soup?'. 

The last question has seemed so difficult to answer, given the findings of 
typical origins-of-life experiments, that researchers have considered various 
possible exotic precursors (proteinoids, clays) and Creationists have gleefully 
exploited the embarrassment to lobby for their own favorite hypothesis (which 
also involves an exotic precursor). Cech’s discovery promises a way of resisting 
the reformulation of the last paragraph. The major theoretical problem of the 
emergence of self-replicating systems can be tackled by showing how it is 
possible to generate RNAs with sufficient enzymatic activity to direct their own 
replication from hypothetical primitive conditions and subsequently showing 
how descendants of such systems (modified through a succession of analyzable 
chemical transformations) could generate proteins. 

Once again, there is a change in normal form of a problem solution. The 
form of solution previously accepted was to specify an environmental 
condition for a primitive earth (subject to constraints imposed by geology, 
astronomy, chemistry, and so forth) and to derive, using principles of physics 
and chemistry, a description of a subsequent state in which there occur nucleic 
acids and proteins in association, so that these molecules can replicate 
themselves; it is presupposed that this state constitutes the first state in which a 
self-replicating system exists. The new form of solution envisaged replaces the 
description of the products with a description of enzymatic RNAs that are 
capable of self-replication (we set on one side the subsequent problem of 
accounting for the emergence of proteins). 

Here the criteria for revolution are relatively clear cut. Cech’s work would 
have revolutionary significance if you can solve the latter problem and not the 
former. Once again, we do not know at this stage whether or not the revolution 
will occur. The outcome is related to but different from the extent of the change 
within molecular biology (conceived in our preferred narrow way). For if 
RNAs are shown to be capable of only a very few enzymatic functions, then it is 
highly unlikely that RNA could catalyze enough reactions to get the RNA 
world going. On the other hand, the frequency of cases in which nucleic acids 
serve as enzymes in the current biosphere is independent of the possibility of 
discovering a solution that meets the new normal form. It does not matter how 
many biological reactions are currently RNA-driven. What is significant is 
whether the right reactions could once have been RNA-mediated. 

The relationship between the change in molecular biology and the field of 
biotic evolution might turn out to resemble one facet of that between early 
molecular genetics and classical genetics. As noted in Kitcher [1984], the 
discovery of the structure of DNA resolved a major question that had seemed 
insoluble from the perspective of classical genetics—How do mutant alleles, 




that cannot function normally to produce physiologically required products, 
manage to function to replicate themselves? Understanding the structure of 
DNA allowed for an account of mutation and replication that resolved the 
mystery. 

We close by switching our attention from diachronic to synchronic 
relationships. The account of theory structure in molecular biology that we 
have offered develops the idea, advanced in Kitcher [1984], that the 
intertheoretic relationships that philosophers have often tried to describe in 
terms of notions of reduction are best reconceived in terms of the embedding of 
the problem-solving schemata of one field of science in those of another. So, for 
example, molecular genetics furnishes an explanatory extension of classical 
genetics by opening the black box of the relationship between genotype and 
phenotype. The normal form of a problem-solution in classical genetics takes 
the association of allelic combinations with phenotypic traits for granted. 
Molecular biology tries to replace the assertion of association with a derivation 
of a sequence of reactions (see Kitcher [1984] section V, and, for more detail, 
Kitcher [forthcoming]). Other parts of the classical problem solution survive 
unmodified. There is, for example, no molecular replacement for the classical 
understanding of meiotic segregation. 

Anti-reductionists typically want to emphasize the important fact that there 
are patterns revealed by the nonmolecular theory that cannot be captured in 
molecular language. Thus, from the classical point of view, the essence of 
meiosis is that it is a process of pairing and segregation that can, in principle, be 
realized in a motley of molecular mechanisms. Similarly, the goal of molecular 
biology in charting the relationship between genotype and phenotype may not 
succeed because the classical understanding of a developmental process may 
involve properties that are not expressible in the language of molecular 
biology. Geometric relations among tissues that correspond to no molecularly 
specifiable property may be critical to the development of some structures in 
some organisms (Oster and Alberch [1982]). 

We want to end by reconsidering this issue in the light of our current 
understanding of molecular biology. Anti-reductionists suppose that certain 
kinds of properties ('emergent' properties, or ‘higher-level’ properties) can not 
be expressed in the language of molecular biology, and their complaint can be 
formulated in ways that make it completely unmysterious. (There are no 
vitalist spooks lurking in the background.) But, we should ask, what is this 
language of molecular biology that is supposed to be inadequate? Is it the 
language that molecular biologists actually use in devising their problem 
solutions—the language used, for example, in giving a normal form account of 
transcription? If so, then, despite the fact that no specification of some of the 
most plausible candidates for emergent properties (meiotic division, geometri- 
cal relations among tissues) has yet been given, it is not so obvious that it could 
not be given. The anti-reductionist point is much stronger if one thinks of the 




language of molecular biology as ultimately confined to describing the world 
in terms of the situations, internal states and interrelations among individual 
molecules—specifications of shell-filling and geometrical relationships at the 
molecular level, possibly translated in terms of (hypothetical!) Schrödinger 
state-functions. If that is what the language of molecular biology is like, then it 
does appear very likely that molecular biologists will not be able to specify all 
the properties that the developmental biologist adduces in solving classical 
developmental problems. 

But there is no reason to limit molecular biology to so austere a language. If 
reductionists were prepared to accept the limit, then we could even claim that 
molecular biology, as currently practiced, is irreducible to molecular biology. 
Consider the normal form account of transcription. It is full of talk of ‘loose 
association’, ‘site recognition’, and ‘unwinding’. We do not believe that there 
is any austerely specifiable property that will subsume all and only the 
molecular structures covered by any one of these frequently used terms. Our 
moral is that the language of molecular biology is opportunistic. It expands to 
specify the properties that are needed for giving normal form explanations. 
Thus, as developmentalists build a powerful case for saying that certain 
properties are emergent, we should expect the language of molecular biology 
to accommodate. 

So who wins? Everybody. No sensible reductionist should ever have thought 
of the language of molecular biology as frozen to the austere idiom—or if the 
thought appeared, it was surely belied by practice. Moreover, it is only a brief 
step to the acknowledgement that change in molecular biology involves 
interaction with classical fields of biology, the point, of course, on which all 
sensible anti-reductionists have wanted to insist. Our perspective on theory 
structure and theory change is intended to make plausible the idea that 
molecular biology is directed at problems furnished by parts of non-molecular 
biology, and that its problem-solutions integrate in interesting ways with the 
problem-solutions developed autonomously by these other disciplines. The 
final wrinkle is that the language of molecular biology itself can be expected to 
develop through interaction with such autonomous problem-solutions. Once 
the achievements of the various disciplines are exhibited and compared—as 
we have tried to do in a preliminary way here—it seems to us that light dawns 
and heat dissipates. 


Department of Immunology, 
Research Foundation of Scripps Clinic, La Jolla 


Department of Philosophy, 
University of California, 
San Diego 




REFERENCES 


BALZER, WOLFGANG AND DAWE, C. M. [1986]: ‘Structure and Comparison of Genetic 
Theories [(1) Classical Genetics, and (2) The Reduction of Character-Factor 
Genetics to Molecular Genetics]’, British Journal for the Philosophy of Science, 37, pp. 
55-69, 177-91. 

Bass, B. L. AND CECH, T. R. [1984]: ‘Specific Interaction between the Self-Splicing RNA 
of Tetrahymena and its Guanosine Substrate: Implications for Biological Catalysis 
by RNA', Nature, 308, pp. 820-26. 

BEATTY, JOHN [1980]: ‘What’s Wrong with the Received View of Evolutionary Theory?’, 
in P. Asquith and R. Giere (eds.): PSA 1980. East Lansing: Philosophy of Science 
Association. 

BROMBERGER, SYLVAIN [1963]: ‘A Theory about the Theory of Theory and about the 
Theory of Theories’, in W. L. Reese (ed.): Philosophy of Science, The Delaware 
Seminar. New York: Wiley. 

CECH, T. R. [1986]: ‘A Model for the RNA-Catalyzed Replication of RNA’, Proceedings of 
the National Academy of Sciences (USA), 83, pp. 4360-63. 

CECH, T. R., ZAUG, A. J. AND GRABOWSKI, P. J. [1981]: ‘In Vitro Splicing of the Ribosomal 
RNA Precursor of Tetrahymena: Involvement of a Guanosine Nucleotide in the 
Excision of the Intervening Sequence’, Cell, 27, pp. 487-96. 

DARNELL, J., LODISH, H. AND BALTIMORE, D. [1986]: Molecular Cell Biology. New York: 
Scientific American Books. 

GIERE, RONALD [1979]: Understanding Scientific Reasoning. New York: Holt, Rinehart 
and Winston. 

GILBERT, WALTER [1986]: ‘The RNA World’, Nature, 319, p. 618. 

GUERRIER-TAKADA, C., GARDINER, K., MARSH, T., PACE, N. AND ALTMAN, S. [1983]: ‘The 
RNA Moiety of Ribonuclease P is the Catalytic Subunit of the Enzyme’, Cell, 35, pp. 
849-57. 

HACKING, IAN [1983]: Representing and Intervening. Cambridge: Cambridge University 
Press. 

INOUE, T., SULLIVAN, F. X. AND CECH, T. R. [1985]: ‘Intermolecular Exon Ligation of the 
rRNA Precursor of Tetrahymena: Oligonucleotides can Function as 5' Exons', Cell, 
43, pp. 431-7. 

Joyce, G. F., SCHWARTZ, A. W., MILLER, S. L. AND ORGEL, L. E. [1987]: ‘The Case for an 
Ancestral Genetic System Involving Simple Analogues of the Nucleotides’, 
Proceedings of the National Academy of Sciences (USA), 84, pp. 4398-402. 

KITCHER, PHILIP [1981]: ‘Explanatory Unification’, Philosophy of Science, 48, pp. 507-31. 

KITCHER, PHILIP [1982]: Abusing Science: The Case Against Creationism. Cambridge MA: 
MIT Press. 

KITCHER, PHILIP [1983]: The Nature of Mathematical Knowledge. New York: Oxford 
University Press. 

KITCHER, PHILIP [1984]: ‘1953 And All That. A Tale of Two Sciences’, Philosophical 
Review, 93, pp. 335-73. 

KITCHER, PHILIP [1985a]: ‘Darwin’s Achievement’, in N. Rescher (ed.): Reason and 
Rationality in Natural Science. Washington DC: University Press of America. 

KITCHER, PHILIP [1985b]: Vaulting Ambition: Sociobiology and the Quest for Human 
Nature. Cambridge MA: MIT Press. 




KITCHER, PHILIP [forthcoming]: ‘Explanatory Unification and the Causal Structure of 
the World’, to appear in Philip Kitcher and Wesley Salmon (eds.): Scientific 
Explanation. Minneapolis: University of Minnesota Press (Minnesota Studies in the 
Philosophy of Science, Volume XIII). 

KRUGER, K., GRABOWSKI, P. J., ZAUG, A. J., SANDS, J., GOTTSCHLING, D. E. AND CECH, T. R. 
[1982]: ‘Self-Splicing RNA: Auto-excision and Autocyclization of the Ribosomal 
RNA Intervening Sequence of Tetrahymena’, Cell, 31, pp. 147-57. 

KUHN, THOMAS [1970]: The Structure of Scientific Revolutions. Chicago: University of 
Chicago Press. 

LAUDAN, LAURENS [1984]: Science and Values. Berkeley: University of California Press. 

Lrg, Remar [unpublished]: Theory Change in Cardiology, Ph.D. Dissertation, University of 
Minnesota, 1987. 

LLOYD, ELISABETH [forthcoming]: The Structure and Confirmation of Evolutionary Theory. 
Westport CT: Greenwood Press. 

LLOYD, ELISABETH [1984]: ‘A Semantic Approach to the Structure of Population 
Genetics’, Philosophy of Science, 51, pp. 242-64. 

Moore, P. B. [1988]: ‘The Ribosome Returns’, Nature, 331, pp. 223-7. 

OSTER, GEORGE AND ALBERCH, PERE [1982]: ‘Evolution and Bifurcation of Developmental 
Programs’, Evolution, 36, pp. 444-59. 

RAILTON, PETER [1981]: ‘Probability, Explanation, and Information’, Synthese, 48, 
233-56. 

SCHAFFNER, KENNETH [forthcoming]: The Structure of Biomedical Science. 

SHAPERE, DUDLEY [1974]: ‘Scientific Theories and their Domains’, in Frederick Suppe 
(ed.) The Structure of Scientific Theories, Urbana: University of Illinois Press. 

SULLIVAN, F. X. AND CECH, T. R. [1985]: ‘Reversibility of Cyclization of the Tetrahymena 
rRNA Intervening Sequence: Implication for the Mechanism of Splice Site Choice’, 
Cell, 42, pp. 639-48. 

SUPPE, FREDERICK [1972]: ‘What’s Wrong with the Received View on the Structure of 
Scientific Theories?’, Philosophy of Science, 39, pp. 1-19. 

VAN FRAASSEN, Bas [1980]: The Scientific Image. Oxford: Oxford University Press. 

WATSON, J. D., HOPKINS, N. H., ROBERTS, J. W., STEITZ, J. A. AND WEINER, A. M. [1987]: 
Molecular Biology of the Gene (fourth edition). Menlo Park: Benjamin Cummings. 

WEINER, A. M. AND MAIZELS, N. [1987]: ‘tRNA-like Structures Tag the 3’ Ends of 
Genomic RNA Molecules for Replication: Implications for the Origin of Protein 
Synthesis’, Proceedings of the National Academy of Sciences (USA), 84, pp. 7383-87. 

WESTHEIMER, F. H. [1986]: 'Polyribonucleic Acids as Enzymes’, Nature, 319, pp. 534-6. 

WILLIAMS, MARY [1970]: ‘Deducing the Consequences of Evolution’, Journal of 
Theoretical Biology, 29, pp. 343-85. 

WILLIAMS, MARY [1973]: ‘Falsifiable Predictions of Evolutionary Theory’, Philosophy of 
Science, 40, pp. 518-37. 

ZAUG, A. J., GRABOWSKI, P. J. AND CECH, T. R. [1983]: ‘Autocatalytic Cyclization of an 
Excised Intervening Sequence RNA is a Cleavage-Ligation Reaction’, Nature, 301, 
pp. 578-83. 

ZAUG, A. J., KENT, J. R. AND CECH, T. R. [1984]: ‘A Labile Phosphodiester Bond at 
the Ligation Junction in a Circular Intervening Sequence RNA’, Science, 224, 
pp. 574-8. 

ZAUG, A. J., KENT, J. R. AND CECH, T. R. [1985]: ‘Reactions of the Intervening Sequence of 
the Tetrahymena Ribosomal Ribonucleic Acid Precursor: pH Dependence of 
Cyclization and Site-Specific Hydrolysis’, Biochemistry, 24, pp. 6211-18. 

ZAUG, A. J. AND CECH, T. R. [1986]: ‘The Intervening Sequence RNA of Tetrahymena is an 
Enzyme’, Science, 231, pp. 470-5. 


Brit. J. Phil. Sci. 40 (1989), 485-499 Printed in Great Britain 


Biological Foundations of Prediction 
in an Unpredictable Environment 


M. MARUSIC 





1 Summary 

2 Unpredictability of the Environmental Challenges 

3 Evolutionary Foundation of the Genome 

4 The Principle of General Readiness 

5 Predicting the Unpredictable as a Common Denominator for All Types of 
Recognition 

6 Conclusion 


1 SUMMARY 


The recognition of innumerable stimuli from the environment is a fundamen- 
tal property of the living world. It enables every biological unit (cell, organ, 
organism) to function and survive in its ecologic niche. The diversity of stimuli 
that must be perceived, processed and reacted to is extremely great at every 
level of organization and respective recognition. Such a diversity often requires 
tremendously elaborate recognition systems, which appear quite different in 
different organisms and rather unrelated when stimuli of different qualities are 
considered. 

In order to act upon the genome as a selective force, any environmental 
factor must fulfil two requirements: it must be vitally important for the species 
in question and it must also act on the genome long enough to assure the 
‘fixation’ of genes controlling the relevant reaction. In so far as the 
environmental challenges are unpredictable (e.g., antigenicity of viral 
mutants), the genome cannot contain genes for specific recognition and 
subsequent reaction to individual environmental challenges. Instead, the 
organism is endowed with the facility for general readiness of the reaction. This 
readiness is substantiated in a repertoire of receptors with clonal distribution of 
specificities. The repertoire is formed before the encounter with environmental 
stimuli. During its life it confronts stimuli in the form of a web of receptor 
specificities. The web is dense and wide enough to assure that a stimulus 
always finds its receptor counterpart; finally, the reaction appears specific but 
this specificity is neither direct nor genetically determined, but is the end result 





486 M. Marusié 


of the mechanisms substantiated in the general readiness principle. The 
selection pressures act on the genome, albeit not to select and fix specific 
genetic mutants (alleles) but to assure that the relevant pool of genes provides 
a generally sufficient repertoire that will cover most of the spectrum of 
potential challenges. The clonal distribution of receptor specificities (i.e. one 
receptor cell expressing a receptor of one specificity only) represents the 
fundamentals of the general readiness principle. The cells with different 
receptors are differentiated through a process of somatic recombination/ 
activation of the germ-line genes, on the basis of a limited number of genes, 
independently of the stimuli to be encountered. The ubiquity of the principle is 
described in various receptor systems (immunological, sensory, biochemical), 
implying that these systems function on the same principle of general 
readiness, fulfilling the task of predicting the unpredictable, an instrumental 
requirement imposed on everything in the living world. 

By the same token, these apparently biological functions can be directly 
related to the mechanisms of knowledge acquisition. While the inductivists, 
sensualists etc. hold that in knowledge acquisition the special comes before the 
general (see Popper [1984]), the deductivists argue that the general comes 
before the special and that sense organs constitute genetically acquired 
general theories about possible future encounters with the environment 
(Popper & Eccles [1977]; Popper [1984]). We believe that the inductivist 
position is incompatible with the facts of biology that are beginning to become 
unravelled. In addition, this theory, by its delineation and physiologic 
interpretation of genetic organization of receptor systems (i.e. by referring to 
their pools of genes in the genome, and somatic recombination/activation of 
these genes to achieve their clonal distribution into receptor cells), and the
proposal of sensory receptors being organized as a repertoire of monospecific, 
different, individual yes-or-no responders, points to the possible directions of 
experimental approach for a better understanding of knowledge acquisition at 
the genetic and cellular level.

This paper is an attempt to identify and define a common denominator for all 
receptor systems, a principle that underlies every type of recognition. The idea
will be explained through an analysis of the immune recognition system since 
it offers an excellent example with an array of requirements and restrictions 
imposed upon its functioning. Moreover, it was a question of immunologic 
recognition that prompted the formulation of the clonal principle of receptor
organization (Burnet [1959]) and all those marvellous intellectual develop- 
ments that followed the advent of that concept (see Silverstein [1984]). 

It will be argued that the challenges from the environment cannot be 
foreseen, in principle, and thus the specificity of recognition in an adult 
organism cannot be coded in the germ-line genes. The organism is thus forced 
to develop a general readiness for reaction, and not a receptor for recognition of 
(and subsequent reaction to) a specific challenge, present or potential. Due to 


Biological Prediction of the Unpredictable 487 


that general readiness, the living world is able to react to the challenges that 
did not specifically exist at the time the relevant receptors were constructed or 
even over a time covering the whole evolutionary history of the species. In 
other words, it is able to ‘predict the unpredictable’. 


2 UNPREDICTABILITY OF THE ENVIRONMENTAL CHALLENGES 


Generally, changes in the conditions of the environment cannot be specifically 
predicted. There is no way for any organism, man included, to foresee which
new toxins will appear in the environment, which light wavelengths will be 
reflected from various materials and their combinations, and which language 
will be constructed or new pathogenic mutant of a virus produced. However, 
all organisms are exposed to the environment with their survival depending on 
their ability to respond properly, and on time, to all relevant challenges. To 
take an immunological example: a relationship between an organism and its 
pathogens actually consists of a delicate balance between the organism's 
defence mechanisms, and the virulence and antigenicity of pathogens. The 
pathogens ‘have no intention’ of killing their host(s); they simply look for an 
environment in which they can survive (Bloom [1979]). Still, because of the 
adverse effects of their invasion, the host must try to eliminate them. The 
pathogens will be deprived of the environment in which they can survive, if 
they either kill all their hosts or if the host develops absolutely effective defence 
mechanisms against them (Langman [1978]). The equilibrium and survival 
for both partners can only be assured when both are imperfect, i.e. when 
neither the defence nor the virulence-antigenicity completely prevails. The 
relationship results in both partners producing a selective pressure on each 
other: the host 'tries' to protect itself with an immune response repertoire which is
as efficient as possible, and the pathogen tends to escape the host's defence
through various mechanisms (Bloom [1979]). This tendency of pathogens to
deceive the host's defence is the principal postulate of this section: the
pathogens permanently mutate in regard to their antigenic specificity, thus
‘aiming’ at the weak points of the host's immunologic defence (Zinkernagel
[1977]; Bloom [1979]). A mutant less antigenic for the given host (i.e. a host's
repertoire of immunologic receptors) will survive in it better, and long enough
to reproduce itself, giving rise to new mutants.

Since mutation is an event subjected solely to the rules of probability 
(Kimura [1968]), antigenic characteristics of any potential mutant cannot be 
foreseen. Thus, the host cannot predict the antigens which it will encounter in 
the future, and the relevant receptors cannot be coded within the pool of its 
germ-line genes. It still has to react to those mutants, or it would become 
extinct. And it does actually react, but we can see clearly that the principle of 
that reaction does not necessitate the use of the genes related to specificity of 
past mutants since they are presently unimportant (not acting), of present
mutants, since they had no time to influence the germ-line genes, or of future
mutants, since they are unpredictable (Cohn [1968]). The defence must be 
organized generally, not specifically. The specificity will be introduced later, 
through the advent of the specific ligand (Edelman [1974]).


3 EVOLUTIONARY FOUNDATION OF THE GENOME 


Generally, proteins of the body are genetically determined (Schulz and 
Schirmer [1979]) and all genes in the genome are maintained within the given 
structure and organization by selective pressures from the environment; this 
can be inferred directly from the theory of evolution, the sole scientific concept
of the living world (Maynard Smith [1975]). 


3.1 Definition of the Selective Pressure 


If any environmental factor is to be considered a selective pressure for a certain 
genetic trait, it must possess two properties: (a) it must be permanently present 
in the relevant environment; and (b) it must have a vital (survival-related) 
influence on the organism in question (Cohn [1968]). Consequently, one can 
easily deduce that, again, the recognition and response to unpredictable 
environmental stimuli cannot be specifically coded in the germ-line genome. 
The receptors and relevant response mechanisms always include protein 
molecules that must be genetically defined. Since unpredictable (temporary, 
transiently present) environmental stimuli cannot influence the genome long 
enough to produce changes through selective pressure, genes coding for specific
receptor-response proteins are impossible to maintain in the germ-line. Among
the genes coding for immunologic receptors, for example, one cannot have a
gene for recognition (and subsequent antibody synthesis) of a particular
infectious agent, regardless of its potential danger for the species (Cohn
[1968]). Thus we do not have a gene for an antibody recognizing the antigen(s)
of Pasteurella pestis, the bacterium causing plague, a life-threatening disease.


3.2 Maintenance of Identity 


Each individual represents a unique product of nature, one that has never 
existed before and will never appear again in nature (Eccles [1979]). The 
uniqueness of the individual is contained in its identity, a set of properties partly
shared with other individuals of the same species but diverse enough to provide 
its uniqueness. The total phenotypic identity of an individual is coded in, and 
corresponds to, the identity and uniqueness of its genome (Eccles [1979]). 
Phenotypic identity is provided by numerous apparently independent systems 
such as blood groups, isoenzyme patterns of various enzyme systems, HLA- 
haplotypes, dermatoglyphes, etc. They enable the individual to be different 
enough from other members of the same species and contribute to diversity of 
the species in a way instrumental in coping with the potential survival-related
surprises in the environment, and still be similar enough to breed with any
other member of its species. 

The given composition is fiercely defended and maintained during its life to 
remain in the same basic setting as it was formed through fertilization. Both 
the existence of individual identity, and its elaborate defence, serve the same 
function: survival of the species. The identity has been painstakingly formed, 
through selection, by innumerable selective pressures and is diligently 
maintained because it functions in the given environment; only in the given 
form may it allow the individuals to procreate and assure the survival of the 
species. 

Only the individuals of the same species can breed among themselves, 
breeding depending on the precise species-specific identity of all its members.
The identity, therefore, must be maintained at least until the time of sexual 
maturity (Burnet [1976]). Receptor systems of the organism undoubtedly play 
a crucial role in the process: they assure the response to environmental 
challenges of all kinds, from better food choice (taste, vision, smell) through the
ability to escape (equilibrium, moving) to the recognition of
individuality-threatening pathogens (immune system) (Klein [1982]).


4 THE PRINCIPLE OF GENERAL READINESS 


As we have seen in the previous paragraphs, an organism cannot directly 
predict an encounter with specific environmental challenges. It does so in an 
indirect, general way. Instead of being able to react specifically, it must be able 
to react generally. This principle of general readiness for reaction to 
environmental stimuli is to a large extent based on clonal distribution of 
receptor specificities. (The concrete receptor systems are discussed in this 
respect in Section 5.) 


4.1 Clonal Distribution of Receptor Specificities


4.1.1 Immunologic Clonal Selection Theory 


In an attempt to explain how the immune system responds specifically to every 
antigen that enters the body, Macfarlane Burnet postulated a clonal selection 
theory (Burnet [1959]). The theory brought him a Nobel prize in 1960, and 
was a breakthrough in immunology that transformed it from a mere appendix
of microbiology to a most propulsive biomedical discipline, both intellectually 
and technologically (Silverstein [1984]). Burnet [1959] postulated that every 
lymphocyte expresses only one receptor, with its progeny with identical 
receptors forming a clone. The lymphoid system consists of many such clones, 
where every clone expresses different specificity. The clones differentiate before 
the encounter with antigens, independently of antigen specificity. 

When an antigen enters the body, it flows around until it meets a member of
a clone that has a receptor stereospecific for one of its antigenic determinants.
A receptor-ligand interaction then occurs, stimulating the receptor-bearing 
cell to proliferate and synthesize more such receptors which are subsequently 
released in the body fluids as antigen-specific antibodies (Klein [1982]). 

Burnet's theory was experimentally corroborated many times and applied, 
directly and successfully, to a very important use in production of monoclonal 
antibodies (Köhler and Milstein [1975]).
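
The three steps of the theory just described (a repertoire built before any antigen is met, selection of the fitting clone by the antigen, expansion of that clone) can be illustrated with a small numerical sketch. It is only a toy model under assumptions that are not in the text: specificities are represented as points on a hypothetical one-dimensional 'shape space' between 0 and 1, a receptor 'fits' when it lies within an arbitrary distance of the stimulus, and the function names and the expansion factor are invented for illustration.

```python
import random


def build_repertoire(n_clones, seed=0):
    """Generate receptor specificities before any stimulus is seen.

    Each clone carries exactly one specificity, modelled (as an assumption)
    as a point on a hypothetical one-dimensional shape space [0, 1).
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_clones)]


def select_clones(repertoire, stimulus, threshold=0.01):
    """Return the indices of clones whose receptor fits the stimulus."""
    return [i for i, r in enumerate(repertoire) if abs(r - stimulus) <= threshold]


def expand(clone_sizes, selected, factor=10):
    """Selected clones proliferate; all other clones are left unchanged."""
    for i in selected:
        clone_sizes[i] *= factor
    return clone_sizes


# The repertoire is generated first, without reference to any stimulus.
repertoire = build_repertoire(n_clones=10_000)
clone_sizes = [1] * len(repertoire)

# A stimulus that played no part in building the repertoire.
novel_stimulus = 0.4242
hits = select_clones(repertoire, novel_stimulus)
clone_sizes = expand(clone_sizes, hits)

print(len(hits), "clone(s) selected out of", len(repertoire))
```

In this sketch the stimulus never influences how the repertoire is built; it only picks out the clones that already happen to fit it.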


4.1.2 Implications of the Clonal Selection Theory 


Aside from the great technological usefulness of monoclonal antibodies, we 
should stress the crucial theoretical implications of the clonal selection theory: 
(a) the specificity of recognition of the relevant stimuli is clonally distributed 
among receptor-bearing cells; (b) clonal differentiation takes place before and 
independently of the ligands (antigens) in question; and (c) it is the ligand 
specificity that selects the relevant clone and enables it to expand to experimen- 
tally recognizable level (Burnet [1959]). The aim of this paper is to widen the 
described immunologic principle to all body receptors, because they all meet 
the same problem, i.e. unpredictability of the stimuli. Later we shall describe 
how all three basic implications of the clonal selection theory (i.e. clonal
distribution of specificities, stimulus-independent specialization, selection of a 
predetermined receptor by the stimulus) can be recognized in other receptor 
systems; here we are due to comment on the biological importance of the 
concept of clonal distribution of specificities. 


4.1.3 Specificity of the Receptor-bearing Cell 


First, a stimulus cannot 'instruct' the receptor to become specific, since the 
receptor is genetically determined; the stimulus cannot 'instruct' the genes 
either—unless it fulfils the requirements for being a selective factor (see Section 
3.1). Secondly, the receptors must be clonally distributed, so that a single cell 
carries a single receptor specificity, as otherwise the specificity of recognition 
would be lost (Edelman [1974]; Langman [1978]). So it appears that second 
(and other) messengers, transferring the signal received by the membrane 
receptors to the cell, are relatively unspecific (Berridge [1985]). The known 
signal pathways in cells are few in number; in functional terms they share a 
sequence of events. External messengers arriving at receptor molecules in the
plasma membrane activate a family of transducer molecules, which carry 
signals through the membrane, and amplifier enzymes, which activate 
internal signals carried by second messengers. These pathways include the 
receptor, the amplifiers adenylate cyclase and phospholipase C, G-protein and 
second messengers cAMP, phosphatidyl inositol 4,5-biphosphate and calcium 
ions (Levitzki [1984]). In general, second messengers bind to the regulatory
component of protein kinase, an enzyme that activates a cellular response
predetermined by the developmental genetic mechanisms for the given cell
(Berridge [1985]). The specificity of response relies solely on the specificity of 
the receptor-ligand interaction, which also obeys a simple 'yes-or-no' logic 
(there is no intelligence in Nature, human intelligence included: all phenomena
can be narrowed down to fundamental laws of physics and chemistry).

If a receptor-bearing cell carried receptors of two or more specificities, either 
of the relevant stimuli would elicit the same reaction, i.e. the one specific for the
cell in question. The specificity of recognition would be diminished or lost. This
is well illustrated by the immunologic example: dual or multi-specificity of
lymphocytes would lead to production of auto-reactive lymphocytes or to a
deletion from the repertoire of all anti-foreign specificities coupled on the same
lymphocyte with anti-self ones (Langman [1978]). 


4.2 Characteristics of the Receptor Repertoire 


Judging from the studies of immune response, it appears that the receptor 
repertoire has two general characteristics: plasticity and imperfection (Fath- 
man and Fitch [1984]). 


4.2.1 Plasticity of the Repertoire 


Plasticity (degeneracy) of the receptor repertoire means that one receptor can 
be stimulated with two or more similar stimuli (ligands). This does not contest 
our statements from the preceding paragraph inasmuch as that cross-reaction 
is quite understandable biochemically: a receptor-bearing cell is stimulated 
when the receptor-ligand reaction reaches a certain threshold of specificity. The 
stimuli which are met with less than liminal specificity are not recognized by
the given clone; provided the threshold is tuned high enough, a great degree of 
specificity will remain undisturbed. The cross-reactivity (overlapping) con- 
cerns only the signals that are equal to or more specific than the threshold 
level. This means that two very similar antigenic determinants would 
stimulate the same clone, and that two sources of light of similar wavelengths 
would stimulate the same cone in the retina. In a way, it means that the two
stimuli will not be distinguished; however, as we can imagine, selective
pressures push the specificity not to an absolute but to a functional
degree (Cohn [1968]; Edelman [1974]). Overlapping of receptor specificities
produces a stronger reaction to a stimulus (i.e. triggers more clones); the web of
receptor specificities is thus made more dense and fewer stimuli escape the
recognition and subsequent reaction against them. 
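
The threshold argument can be made concrete with a short sketch. Everything numerical in it is an assumption chosen for illustration, not a measurement: receptors are tuned to hypothetical peaks spaced 20 units apart, the fit between receptor and stimulus is modelled as a Gaussian of arbitrary width, and a receptor answers 'yes' only when the fit reaches the threshold.

```python
import math


def affinity(peak, stimulus, width=30.0):
    """Gaussian fit between a receptor tuned to `peak` and a stimulus.

    The Gaussian form and the width are illustrative assumptions.
    """
    return math.exp(-((stimulus - peak) ** 2) / (2 * width ** 2))


def responders(peaks, stimulus, threshold=0.6):
    """Receptors whose fit reaches the threshold answer 'yes'; the rest 'no'."""
    return {p for p in peaks if affinity(p, stimulus) >= threshold}


# Hypothetical receptor peaks in arbitrary units, one specificity per receptor.
peaks = range(400, 701, 20)

similar_a = responders(peaks, 500)
similar_b = responders(peaks, 505)
distant = responders(peaks, 620)

# Two similar stimuli trigger the same overlapping set of receptors,
# so they are not told apart; a distant stimulus triggers a different set.
print(sorted(similar_a), sorted(similar_b), sorted(distant))

# With a much stricter threshold a stimulus falling between two peaks
# finds no receptor at all: the web now has a hole.
print(responders(peaks, 510, threshold=0.95))
```

With the moderate threshold, overlapping specificities mean that every stimulus in the covered range meets several receptors; with the strict threshold the same repertoire leaves gaps, which is the trade-off between specificity and density of the web described above.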


4.2.2 Imperfection of the Repertoire 


A fine analysis of the immunologic receptor repertoire has revealed that the
repertoire does not perfectly cover the spectrum of necessary specificities: it has
holes in its texture, and its width is limited (Fathman and Fitch [1984]).
The holes in the immunologic receptor repertoire correspond to anti-self
specificities, since the lymphocytes must be deleted or inactivated towards
self-antigens in order to avoid auto-immunity (Burnet [1959]; Klein [1982];
Fathman and Fitch [1984]). Such phenomena have not been adequately 
investigated in other receptor systems, although they might be expected and 
may prove physiologically interesting and important. (By this, we do not mean 
the pathologic imperfection, e.g., protanopia.) Due to biological limitations, i.e. 
embryologic, evolutionary and thermodynamic principles, the repertoire of 
certain receptor systems does not cover all possible stimuli existing in the 
environment. For example, the system of sound receptors does not cover all the 
sound frequencies possible, and the cone repertoire does not react to all light 
wavelengths (cf. Dudel [1978]).

Both kinds of imperfection are evolutionarily regulated in such a way that 
the repertoire remains imperfect and restricted, but only to the level where, 
balanced against the strength of the corresponding selective pressure factor, it 
still enables most members of a species to survive until their sexual maturity. 


4.3 Clonal Selection as a Mechanism of General Readiness Principle 


The present analysis has hitherto defined the facts pertinent to the subject of 
this article, i.e. a relationship between the genome and selective pressures,
unpredictability of environmental challenges and clonal distribution of 
receptor specificities. Now we should fuse them together into a coherent 
picture. 

A clonally organized receptor repertoire is confronted with unpredictable 
environmental stimuli; it is wide and dense enough to recognize all relevant 
challenges and enable the organism to react specifically, and in time to assure 
its survival. Relevant selective pressures from the environment with their 
specific properties (antigenic determinants, frequencies, wavelengths, etc.)
interact with one or more receptors from the corresponding repertoire—those 
which happened to fit the given specificity. The reaction thus ends up as being 
specific, although that specificity has not been predicted in its individuality. 
The general quality of the repertoire (continuity of the texture, width and 
density) allows a completely unknown environmental stimulus to ‘find’ its 
counterpart in the body and make the reaction agent-specific. Selection 
pressures do act on the genome, not to select and fix specific genetic mutants
(alleles) but to assure that the relevant pool of genes provides a receptor 
repertoire which is generally good enough to cover most of the spectrum of 
potential challenges. 

In the immunologic example, the environmental antigens, such as those on 
pathogenic micro-organisms, do not select for specific T-cell receptor genes, 
but act indirectly, i.e. they enforce the polymorphism of genes coding for major
histocompatibility antigens, and select the individuals with certain alleles; 
major histocompatibility antigens, in turn, influence the repertoire of T-cell 
receptors and actually determine the quality of repertoire of immunologic
reactivity of the organism in question (Klein [1982]). The selection concerns
certain major histocompatibility alleles, and these genes, in turn, provide a 
general basis for the selection of an immunologic repertoire (Klein [1982]). 
Thus, the readiness for the response becomes general, and an evolutionary
challenge-response balance is established on a systemic level, not on the basis 
of a one-to-one relationship (Edelman [1974]). It is only this generality of readiness of
the recognition-response systems that can cope with the unpredictability
of environmental challenges. With its genetic endowment, the organism 
biologically ‘assumes’ that its possible responses are relevant to its survival, 
but does not predict in a narrower, individual, specific sense. 


4.4 A Note on Genetic Mechanisms of Generation of Diversity


The receptors of the immunologic repertoire are generated on the basis of a 
pool of several hundreds of germ-line genes that are somatically rearranged 
and/or modified by somatic mutations during the differentiation of lympho- 
cytes (Alt et al. [1986]). Therefore a relatively small number of genes provides a 
great variety of receptor proteins—sufficient to assure the recognition and 
reaction to all dangerous environmental pathogens (Fathman and Fitch 
[1984]). 
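
The arithmetic behind this point, that a few hundred genes can yield a far larger number of receptors, can be sketched as follows. The segment classes (V, D, J), the two-chain pairing and all the counts are assumptions introduced purely for illustration; they are not figures taken from the studies cited above, and somatic mutation, which would enlarge the repertoire further, is left out.

```python
from math import prod

# Hypothetical numbers of germ-line gene segments per class (illustrative only).
heavy_segments = {"V": 200, "D": 12, "J": 4}
light_segments = {"V": 200, "J": 4}

# Size of the germ-line pool: segments are simply counted.
germline_genes = sum(heavy_segments.values()) + sum(light_segments.values())

# Somatic rearrangement picks one segment from each class,
# so the number of possible chains is a product, not a sum.
heavy_chains = prod(heavy_segments.values())
light_chains = prod(light_segments.values())

# A receptor is assumed here to pair one chain of each kind.
receptors = heavy_chains * light_chains

print(f"{germline_genes} germ-line segments -> {receptors:,} possible receptors")
```

Under these assumed counts, 420 inherited segments combine into several million distinct receptors, which shows how the pool of genes can stay small while the repertoire it generates is wide.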

Similar mechanisms for other receptor systems have not been described, as
the relevant genetic systems are only now being adequately studied (Nathans 
et al. [1986]). Nevertheless, we should start asking ourselves about the genetic 
(and epigenetic) mechanisms which render the pool of cones in the retina 
responsive to quite a range of different wavelengths, or the cells in Corti organ 
reacting to different frequencies of sound. How many genes are there in the 
germ-line gene pool that code for wavelength-specificity of retinal cones? 
Probably not as many as individual cone specificities existing in a mature 
retina, but certainly numerous enough and/or subjected to sufficiently 
efficient transcription and post-transcription modification mechanisms to
assure generation of all cones endowed with all known sensitivities. In other 
words, only genetic and epigenetic mechanisms similar to those already 
discovered for lymphocytes can presently be envisioned for all other multi- 
specific receptor systems. 


5 PREDICTING THE UNPREDICTABLE AS A COMMON DENOMINATOR 
FOR ALL TYPES OF RECOGNITION 


Basic facets of the postulated principle of general readiness that assures 
reaction to unpredictable environmental stimuli have been outlined in the 
previous sections. Now we shall describe in more detail how the principle can 
be applied to several major receptor systems in the body. 

Immunologic recognition was used throughout the paper to illustrate the
major points of the analysis, and thus the details of its functioning (although
perhaps the best investigated example) will not be further reiterated. 


5.1 Brain and Language 


Whereas the clonal organization of the central nervous system can, at present, 
be only suspected, its capability of reaction to unpredictable stimuli is already 
apparent (Cohn [1968]; Eccles [1979]). 

Clonal distribution of specificities among the neurons can be inferred from 
already recognized specialization of neurons, neuronal layers and columns for 
different components of the afferent information, i.e. from different excitation 
patterns of different neurons upon contact with the same afferent stimulus (cf. 
Kandel [1985]). 

The readiness of the brain to react to unpredictable stimuli can easily be 
conjectured from the analysis of any of its intellectual functions, perhaps best 
in language acquisition (Cohn [1968]; Eccles [1979]). Every man can learn 
any language, either ancient ones or those invented today (Esperanto,
computer languages). Although all of us know one language best, i.e. that
which we proudly call our mother tongue, its foundation lies solely in the fact
that our mother taught us just this language. A Croatian child would
consider English its mother tongue if brought up in England, in an English
family; the same applies to an English child permanently exposed to the Croatian
language. Obviously, inheritance has nothing to do with language
acquisition: the brain simply offers a possibility to learn a language—a system 
of stimuli in combinations completely unpredictable at the time of birth of an 
individual. 

The cellular and molecular basis of that specific general readiness is 
inadequately understood at present (Eccles [1979]). 


5.2 Vision 


Recognition of colours offers an excellent example of a receptor system with 
clonal distribution of specificities and an ability to react to unpredictable
stimuli: individual retinal cones are excited by light of a certain range of 
wavelengths only (Nunn et al. [1984]). As a system, all cones together cover 
wavelengths from approximately 400 nm to 700 nm. Light of a given range of 
wavelengths stimulates various proportions of cones of different specificities; 
integrated information on those proportions is interpreted in the brain as the 
sense of a certain colour (Eccles [1979]; cf. Gouras [1985]). 

The variety of colours which will reach the retina is unpredictable in as
much as any combination of different wavelengths can be produced in the 
environment. The importance of colour recognition probably concerns several 
important survival-related functions such as food selection, predator and prey 
recognition, etc. The range of wavelengths covered by a receptor repertoire is 
determined by the particular living conditions of a given species in its ecologic
niche. Again, the relevant selective pressure forces a genetic development of
repertoire receptors covering the range of wavelengths not wider than vitally 
required, and not narrower than evolutionarily unavoidable. Within the 
repertoire, however, the stimuli are predicted only generally, although the 
information finally gains a high degree of specificity. 


5.3 Sound 


Everything said for vision in the previous paragraph also applies to sound 
recognition. Moreover, the recognition of sound frequency is explained by place 
theory (cf. Klinke [1978]), which is actually a typical concept of clonal selection 
of receptors. The basilar membrane of the cochlea contains small filaments 
with different resonance frequencies continuously distributed from oval 
window to helicotrema according to their declining resonance frequency. A 
certain frequency of sound thus stimulates the resonant vibration of a certain 
group of filaments on the basilar membrane; this, in turn, stimulates 
corresponding hair cells in the Corti organ. A precise spatial pattern of 
excitation is further maintained throughout the auditory nerve pathway to 
the parietal cortex and in the cortex itself (cf. Klinke [1978]). Stimulation of 
certain neurons of the cortex is interpreted as a sound of certain frequency (a 
tone). One can see that the clonal principle here (and also in the examples of 
vision, and in other nervous receptor systems) concerns both the receptor and 
neurons in the cortex. This setting assures a specific recording and distinguish- 
ing of senses of various qualities (modalities). It is called the principle of specific 
nerve energies (cf. Guyton [1976]). The mechanisms responsible for precise 
differences in the lengths of filaments in the cochlea basilar membrane are 
unknown but, by any reasoning, they appear to be genetically determined. 
Either the cells producing the filaments are clonally monospecific (i.e. a
particular cell produces a filament of a particular length) or the lengths of the
filaments are (also) determined by the distance from the spiral osseous lamina 
to the edge of cochlear bone channel; this, however, must also be genetically 
programmed. 


5.4 Proprioception 


A rather complicated interaction of several receptor systems allows us to 
become aware of the orientation of our limbs with respect to one another, and 
to perceive the movements of our joints (cf. Guyton [1976]). A close look at any 
segment of the receptor systems involved in this kind of perception reveals the 
involvement of the place theory principle mentioned in the previous para- 
graph. The joint capsules contain many proprioreceptors (joint receptors), each 
of them being stimulated when the limb is bent to a certain angle (cf. Guyton 
[1976]). Different joint receptors are sensitive to different angles to which the 
joint is positioned. The information is transformed into a spatially defined 
system of signals and the brain is able to decipher the angle without the visual
information (cf. Guyton [1976]). It is an example of clonal distribution of
specific sensitivity covering precisely the bending limits of the joint. At present, 
the molecular basis of proprioreceptor specialization is not known; it is possible 
that they are specific with regard to their anatomical position only. 


5.5 Metabolism of Chemicals 


This example is important, since it transfers the analysis to a third field (after 
immunology and nerve receptors)—the biochemistry of alimentation and 
detoxification, so documenting further the ubiquity of the general readiness 
principle. 

In the case of the metabolism of chemicals, the requirement of reacting to 
unpredictable stimuli is easy to recognize, but the clonal distribution of 
reaction specificities can only be vaguely suspected. 

It is apparent that the alimentary mechanisms cannot specifically predict 
what kind of chemicals will enter the organism with the food and the 
mechanisms of detoxification cannot directly anticipate the structure of 
chemicals that appear in the environment (many were synthesized during 
several recent decades) and might enter the body through various routes 
(inhalation, alimentation, skin). 

In its (general!) response, the organism develops the ability to perform all
crucial biochemical reactions sufficiently to accomplish detoxification of all 
common environmental chemicals. The organism meets substrates from the 
environment with a library of biochemical reactions without predicting the 
details of the fine structure of those substrates. This structure can almost 
always be broken down to several typical chemical radicals and bonds that can 
be handled by reactions which hydrolyze, methylate, oxidize, glucuronidate, 
etc. 

Of course, the spectrum of available reactions (enzymes) must match the 
spectrum of basic chemical reactions inherent in the chemicals that a species 
encounters in its everyday life. Natural selection takes care of this by selecting 
the individual variants with appropriate enzyme specificities; in this manner, 
the species balances its abilities with the requirements, or dies out. 

Clonal distribution of reaction specificities in this case would mean that, for 
example, liver cells express one (iso)enzyme per cell and that particular 
interaction of the substrate with the enzyme in/on the cell selects for that 
particular class of cells (Cohn [1968]). Experimental corroboration of such a
possibility is scarce, but it does exist (Bakemeier [1961]). The question is
extremely interesting: does the individual liver cell perform only one, or more, 
functions of the multitude that we recognize at the level of the organ as a 
whole? 


6 CONCLUSION 
A succinct conclusion of this article could be given as a series of mutually
interdependent proposals. (1) The changes (challenges, stimuli) of the
environment are unpredictable. (2) Therefore, no environmental challenge or 
stimulus can specifically act on thé genome as a selective force in as much as it 
rarely, or never, influences the genome to be selected long enough to fix within 
it a gene for the specific reaction to this challenge. (3) However, the organism 
must be able to react to all relevant challenges in order to survive at least to the 
time of sexual maturity. (4) To fulfil this survival-related requirement, during 
evolution the multicellular organisms have built the mechanisms enabling 
them to develop a general readiness for the reaction. (5) The fundamental 
mechanism of this readiness is the generation of a repertoire of receptors with 
clonally distributed specificities. (6) This repertoire is generated before encounter 
and independently of environmental challenges, during the ontogeny and 
maturation of the organism. (7) The generation of the diversity of the receptors 
is genetically determined by the pool of relevant germ-line genes and their 
somatic recombination and distribution/activation in the individual functional 
receptor cells. (8) The specificity of the reaction to a particular 
challenge/stimulus is achieved through an a posteriori, chance-borne physical 
suitability of a receptor and ligand (stimulus). (9) Selective processes do not 
concern individual germ-line genes directly, but the quality (abundance and 
somatic diversification potential) of their pool, and/or other genes that 
determine the mechanisms which can influence (distort and adapt) the 
randomly generated repertoire of receptors in question. (E.g., HLA-genes, 
through thymic selection of lymphocyte clones by HLA-antigens, finally 
determine particular characteristics of an individual's immune reactivity.) 
(10) Therefore, the ability to react is shaped by evolution, genetically 
programmed and still able to specifically interact with challenges that do not 
have the attributes of selective forces, i.e. the organisms are able to 'predict the 
unpredictable'. 

In general terms, the presented theory corroborates Popper's view that the 
general comes before the specific and that sense organs constitute genetically 
acquired theories about future encounters with the environment. Even more 
generally, it is implied that knowledge acquisition does not materialize 
through the accumulation of information on the specific environmental 
stimuli, but through a stimulation-selection of clonally differentiated 
monospecific receptors within a repertoire, just as general theory awaits the data 
that fit it. A single key for a letter on a typewriter can do nothing but type that 
letter; by itself it cannot inform us of anything. But, within the repertoire of all 
typewriter keys for individual letters, it can await and set down any language, 
any idea. 

Secondly, and maybe even more importantly, this theory opens new 
possibilities for an experimental approach to the study of learning of any kind. 
With its proposals on receptor gene pools, genetic recombination mechanisms 
and clonal distribution of receptor specificities, it clearly points to the concrete 
mechanisms that could and should be studied. 


School of Medicine 
University of Zagreb 


REFERENCES 


ALT, F. W., BLACKWELL, T. K., DEPINHO, R. A., RETH, M. G. and YANCOPOULOS, G. D. 
[1986]: Regulation of genome rearrangement events during lymphocyte 
differentiation. Immunological Review, 89, pp. 5-30. 

BAKEMEIER, R. F. [1961]: A possible cellular explanation of the multiplicity of steroid 
reductases. Cold Spring Harbour Symposium, Quantity Biology, 26, pp. 379-87. 

BERRIDGE, M. J. [1985]: The molecular basis of communication within the cell. Scientific 
American, 253, pp. 124-36. 

BLOOM, B. R. [1979]: Games parasites play: how parasites evade immune surveillance. 
Nature, 279, pp. 21-6. 

BURNET, F. M. [1959]: The Clonal Selection Theory of Acquired Immunity. Cambridge: 
University Press. 

BURNET, F. M. [1976]: Immunology, Aging and Cancer. San Francisco: Freeman. 

COHN, M. [1968]: The molecular biology of expectation, in O. J. Plescia and W. Braun 
(eds.), Nucleic Acids in Immunology, pp. 671-715. New York: Springer Verlag. 

DUDEL, J. [1978]: General sensory psychology, psychophysics, in R. F. Schmidt (ed.) 
Fundamentals of Sensory Physiology, pp. 1-30. New York: Springer. 

ECCLES, J. C. [1979]: The Human Mystery. New York: Springer International. 

EDELMAN, G. M. [1974]: Origins and mechanisms of specificity in clonal selection, in 
G. M. Edelman (ed.), Cellular Selection and Regulation in Immune Response, pp. 1-38. 
New York: Raven Press. 

FATHMAN, C. G. and FITCH, F. W. [1984]: Long-term culture of immunocompetent cells, 
in W. C. Paul (ed.) Fundamental Immunology, pp. 781-95. New York: Raven Press. 

GOURAS, P. [1985]: Color vision, in E. R. Kandel and J. H. Schwartz (eds.) Principles of 
Neural Science, pp. 384-95. New York: Elsevier. 

GUYTON, A. C. [1976]: Textbook of Medical Physiology. Philadelphia: Saunders. 

KANDEL, E. R. [1985]: Processing of form and movement in the visual system, in E. R. 
Kandel and J. H. Schwartz (eds.), Principles of Neural Science, pp. 366-83. New 
York: Elsevier. 

KIMURA, M. [1968]: Evolutionary rate at a molecular level. Nature, 217, pp. 624-6. 

KLEIN, J. [1982]: Immunology: The Science of Self-Nonself Discrimination. New York: 
Wiley. 

KLINKE, R. [1978]: Physiology of hearing, in R. F. Schmidt (ed.) Fundamentals of Sensory 
Physiology, pp. 180-204. New York: Springer. 

KÖHLER, G. and MILSTEIN, C. [1975]: Continuous cultures of fused cells secreting 
antibody of pre-defined specificity. Nature, 256, pp. 495-7. 

LANGMAN, R. E. [1978]: The role of major histocompatibility complex in immunity: A 
new concept in the functioning of a cell-mediated immune system. Review of 
Physical Biochemical Pharmacy, 81, pp. 1-37. 




LEVITZKI, A. [1984]: Receptor to effector coupling in the receptor-dependent adenylate 
cyclase system. Journal of Receptor Research, 4, pp. 399-409. 

MAYNARD-SMITH, J. [1975]: The Theory of Evolution, 3rd ed. New York: Penguin. 

NATHANS, J., PIANTANIDA, T. P., EDDY, R. L., SHOWS, T. B., HOGNESS, D. S. [1986]: 
Molecular genetics of inherited variation in human color vision. Science, 232, pp. 
203-10. 

NUNN, B. J., SCHNAPF, J. L. and BAYLOR, D. A. [1984]: Spectral sensitivity of single cones 
in the retina of Macaca fascicularis. Nature, 309, pp. 264-6. 

POPPER, K. R. [1984]: Critical remarks on the knowledge of lower and higher 
organisms, the so-called sensory motor systems. Experimental Brain Research, 
Suppl 9, pp. 19-31. Berlin-Heidelberg: Springer Verlag. 

POPPER, K. R. and ECCLES, J. C. [1977]: The Self and Its Brain. Berlin-Heidelberg: Springer 
International. 

SILVERSTEIN, A. M. [1984]: The history of immunology, in W. C. Paul (ed.) Fundamental 
Immunology, pp. 23-40. New York: Raven Press. 

SCHULZ, G. E. and SCHIRMER, R. H. [1979]: Principles of Protein Structure. New York: 
Springer. 

ZINKERNAGEL, R. M. [1977]: Role of H-2 gene complex in cell-mediated immunity to 
infectious disease. Transplantation Proceedings, 9, pp. 1835-7. 


Brit. J. Phil. Sci. 40 (1989), 501-518 Printed in Great Britain 


The Many Faces of Irreversibility 


K. G. DENBIGH 


ABSTRACT 


Irreversibility, it is claimed, is a much broader concept than is entropy increase, as 
is shown by the occurrence of certain processes which are irreversible without 
seeming to involve any intrinsic entropy change. These processes include the 
spreading outwards into space of particles, or of radiation, and they also include 
certain biological and mental phenomena. For instance, the irreversible and 
treelike branching which is characteristic of natural evolution is not entropic 
when it is considered in itself, i.e. in abstraction from accompanying biochemical 
and physiological activity. What appears to be the common feature of all forms of 
irreversibility is the fanning out of trajectories, new entities or new states, in the 
temporal direction towards the future. 


1 Introduction 

2 Classes of Irreversible Processes 

3 Irreversibility Defined 

4 T-invariant theories 

5 The Temporal Reference Direction 

6 Irreversibility has no Spatial Analogue 

7 Does Irreversibility Necessarily Involve Entropy? 
8 Fanning Out Towards The Future 


1 INTRODUCTION 


In the Timaeus, Plato spoke of time as ‘revolving’ and it may be that he believed 
that ‘time itself’ is cyclic in some sense. Even so he did not suppose that most 
sequences of events are anything other than irreversible. To be sure ‘life cycle’ 
is a commonly used expression; but clearly it does not mean either that the 
events of a person’s life can occur again in the reverse order, or that a person 
can be reborn and then lead exactly the same life over again. Plato’s view, if I 
have understood it correctly, was that there are cyclic motions in the heavens 
but irreversible sequences on Earth. 

So too in Asiatic cultures, even in those which have been widely regarded as 
having adopted the notion of cyclic time. For example Schipper and Wang 
Hsiu-Hüei [1987] have pointed out that, although Taoist ritual used a concept 
of ‘cycles’ nested within each other, this nesting was in a ‘time’ which, in itself, 
was taken as linear. Similarly in Indian thought; Anindita Balslev [1987] 
recently pointed out that the supposed recurrence of ‘world cycles’ does not 
involve ‘any idea of exact repetition of the particular, and that instead the 
emphasis is on the similarity of the generic features’. 

As we know, Judeo-Christian thought uses the notion of a linear and 
progressive time. But ‘progressive’ in what sense? Science has been widely seen 
as indicating a universe-wide process of ‘running down’, an approach to the 
‘heat death’. However, I have argued elsewhere [Denbigh, 1989] that the 
entropy law is a good deal less restrictive than is commonly supposed. 
Something quite distinct from a running down may also be taking place if we 
can but get a lead on it. 

Towards this end it is useful to give consideration to the concept of 
irreversibility. This has a much wider field of application than has the concept 
of entropy increase, as may be seen from the fact that there exists a wide 
variety of processes which are undoubtedly irreversible whilst seemingly not 
giving rise to any closely correlated entropy change. 

Thus the object of this paper is not at all concerned with the current theories 
either of entropy increase or of ‘chaos’, but rather to consider ‘one way’ 
temporal development in a much broader context, one which will include the 
irreversibility of biological and mental processes. This essay will therefore not 
attempt the mathematical sophistication of Harold Grad's famous paper ‘The 
Many Faces of Entropy' [1961] but instead will be entirely qualitative and 
phenomenological. 

Before proceeding let me repeat the truism that one cannot talk about either 
time or irreversibility without using temporal words. If one were to say, for 
example, that some particular sequence of events does not occur in the reverse 
order, the understanding of ‘sequence’, ‘events’ and ‘occur’ depends on a prior 
acceptance of certain temporal presuppositions which are deeply embedded in 
language. Even the using of the present tense in a 'tenseless' (i.e. timelessly 
true) manner does not always eliminate the presupposing of 'time's arrow'. 
The same applies to the use of many substantives. For instance to speak of 'an 
expansion’ or of a ‘light source’ is tacitly to adopt a particular direction of time 
and not its reverse. The question how that direction is chosen will be deferred 
to Section 5. 


2 CLASSES OF IRREVERSIBLE PROCESSES 


The class which comes first to the thoughts of most scientists is that group of 
processes, occurring in physico-chemical systems, which may be called the 
thermodynamic class. Typical examples are the flow of heat from hotter to cooler 
bodies, frictional and viscous phenomena, inelastic collisions, processes of 
mixing and diffusion, and the immense number of chemical and nuclear 
reactions. They can all be brought together under the umbrella of the Second 
Law since they are characterized by the non-decrease of entropy, so long as all 
consequential changes in the environment are allowed for as well as the 
changes in the system in question. 

There is also a group of physical processes which are irreversible without 
being entropic—or at least not with any certainty. As long ago as 1932, E. A. 
Milne pointed out (Whitrow [1980], p. 10) that a swarm of noncolliding 
particles tend, in due course, to move further and further apart in space, and 
continue to do so forever, even if initially their vector velocities were such that 
they were moving towards each other. A similar irreversibility manifests itself 
in the case of radiation, for the wave fronts tend to expand rather than to 
contract.¹ In terms of Maxwell's theory one speaks of using retarded, and not of 
advanced, potentials in the solution of the equations. 
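
Milne's observation can be illustrated with a short numerical sketch. The following is only an illustration under stated assumptions, not anything given in the original argument: the particles are taken to be non-interacting points in free flight, their starting velocities are chosen (hypothetically) to point roughly towards the origin, and the function names are invented for the purpose. The mean separation first falls and then grows without limit; the same growth appears if the time parameter is run to large negative values.

```python
import itertools
import math
import random


def mean_pair_distance(positions):
    """Average distance over all pairs of particles."""
    pairs = list(itertools.combinations(positions, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def free_flight(positions, velocities, t):
    """Positions after time t for non-colliding, force-free particles."""
    return [
        tuple(x + v * t for x, v in zip(p, u))
        for p, u in zip(positions, velocities)
    ]


def converging_swarm(n=20, seed=0):
    """Hypothetical initial condition: every particle heads roughly
    towards the origin (velocity = -position plus a small scatter)."""
    rng = random.Random(seed)
    positions = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(n)]
    velocities = [
        tuple(-x + rng.uniform(-0.1, 0.1) for x in p) for p in positions
    ]
    return positions, velocities


positions, velocities = converging_swarm()
for t in (0.0, 0.5, 1.0, 2.0, 5.0, 20.0):
    spread = mean_pair_distance(free_flight(positions, velocities, t))
    print(f"t = {t:5.1f}   mean separation = {spread:8.3f}")
```

The swarm contracts for a while because of the chosen velocities, but the particles then pass one another and the separation increases thereafter.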

This theme was developed further by Popper [1956, 1957, 1958] in a series 
of short papers where he discussed the example of circular waves moving 
outwards on the surface of a pond, due to a disturbance at its centre; a ciné film 
of this process run backwards would show an entirely 'unphysical' process of 
waves being generated at the pond's periphery and subsequently converging to 
the central point. Popper argued that the normal tendency of the waves to 
move outwards is a non-entropic form of irreversibility. For although it is true 
that the water waves are damped by viscosity, this is an adventitious factor 
since the same phenomenon of an exclusively outwards motion would occur 
in the idealized situation of a non-viscous liquid. It can also be argued that 
Milne’s example of particles becoming more and more separated from each 
other in free space is probably non-entropic. There are two opposing effects. On 
the one hand there is the familiar fact that the adiabatic expansion of a perfect 
gas from one closed volume to another gives rise to an entropy increase in the 
gas which goes up linearly with the logarithm of its volume. On the other 
hand, in the situation where the same gas expands freely into unconfined space 
then, as it becomes more dilute, there occurs a progressive separation of the 
fastest moving molecules from the slowest moving molecules. This separation 
effect would seem to imply a reduction of entropy, thereby reducing, or 
completely cancelling, the entropy increase due to the expansion. Current 
theory does not provide the means for settling the matter, due to uncertainty 
about whether or not a statistical entropy can properly be attributed to 
particles within an unbounded space. 
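
The first of the two opposing effects can be written out explicitly. The expression below is the standard textbook result for the expansion of a perfect gas between two closed volumes at unchanged temperature; the symbols n (amount of gas), R (gas constant), and V_1 and V_2 (initial and final volumes) are introduced here for illustration and do not appear in the text above.

```latex
% Assumed symbols: n moles of perfect gas, gas constant R,
% initial and final confined volumes V_1 < V_2, temperature unchanged.
\Delta S_{\text{gas}} = n R \ln\frac{V_2}{V_1} > 0
% Example: doubling the volume of one mole gives
% \Delta S = R \ln 2 \approx 5.76\ \mathrm{J\,K^{-1}}.
```

No comparably settled expression is available for the second, separation effect in unbounded space, which is why the net result remains open.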

For further discussion on these forms of irreversibility the reader is referred 
to papers by Hill and Grünbaum [1957] and by Penrose and Percival [1962]. 
For my present purposes it is sufficient to notice that there are physical 
phenomena which are irreversible without necessarily being entropic. 

1 Of course under special conditions, e.g. by use of spherical mirrors, one can obtain a contraction 
of wave fronts. Similarly if a large circular hoop were dropped on to the surface of a pond, it 
would result, at least for a short time, in inwards moving waves. 

Another very important kind of irreversibility is displayed in biology where it 
is familiar enough that the evolutions of the various species of organisms do 
not normally occur in reverse. The feathered birds do not return to being scaly 
reptiles, nor do the reptiles revert to their own parent genera. To be sure there 
are individual instances of regression² or of simplification of function. Simpson 
[1950] pointed out that the notion that evolution is invariably accompanied 
by increase of ‘complexity’ is very difficult to substantiate. What does seem 
certain, however, is that, in the temporal direction we call ‘future’, there 
occurs a branching, what Darwin called a divergence. Throughout the period 
since life first appeared on Earth new species have branched off from existing 
species,³ with the consequence that the overall evolutionary scheme has a 
temporal structure resembling the above-ground structure of a tree. Biologists 
would find it quite unacceptable, I think, to suppose that at some future date 
this structure would start to regress, resulting in all the 'advanced' organisms 
returning to their ancestral states, and leading eventually to all life being 
in the form of unicellular organisms, before these too vanish into a lifeless 
Earth. 

A somewhat similar ‘branching towards the future’ occurs in each 
individual organism. Ontogeny, it was said by Ernst Haeckel, recapitulates 
phylogeny! No doubt this is an oversimplification, but nevertheless it is true, in 
the present context, that the bodily development of each individual traces out 
an irreversible path, just as does the development of the biosphere as a whole. 
The cells of an embryo have the capacity to divide and to differentiate, giving 
rise to a large number of different sorts of cells which go to form the tissues and 
organs of the adult organism; it would appear contra natura to suppose that this 
branching process could ever occur in the opposite direction whereby an adult 
organism would gradually lose the differentiation of its cells and tissues, and 
would eventually revert to a single ovum and a spermatozoon. 

To be sure, the specifically biological kinds of irreversibility are necessarily 
accompanied by the ordinary physiological and biochemical processes of the 
living body. These are processes of fluid flow, of heat transfer and of chemical 
reaction, and, as such, they are entropy producing in the normal way. What I 
have argued is that living things display their own distinctive kinds of 
irreversibility over and above the entropic kind. The question of independence 
will be taken up in Section 7. 

2 It is known that minor forms of adaptive change, such as the colouring of moths, can be reversed 
if all ancestral environments are retraced. See, for example, Harvey and Partridge [1987]. 

3 The number of species among the animals and plants alone is now believed to exceed 10^7. Of 
course it is not to be supposed that the increase in the number of species goes on without 
interruption. Indeed it appears that, at very big intervals of time, there may occur considerable 
extinctions of species. However, the fossil record indicates that the 'niches' made vacant are 
quickly filled, and thereafter the normal increase in the number of species is resumed. 

Of course there was ‘evolution’, in a broad sense, long before there was life. 
The universe itself has evolved and its primitive material has gone into the 
formation of a large number of different kinds of cosmic and stellar objects, 
most notably perhaps a great increase, as the universe cooled, in the number of 
stable molecule types. This chemical evolution [Blum, 1968] was accelerated, as 
soon as life appeared, by a further expansion of molecular variety. As new 
species of plants and animals appeared they synthesized a truly immense 
number of different sorts of organic molecules. The chemical ‘tree’ thus has a 
diverging structure oriented towards the future in much the same way as in 
the case of the biological tree already referred to; the process involved is, of 
course, physico-chemical and not biological. 

In higher animals there is also the irreversibility of mental processes which, 
as experienced in ourselves at least, is a branching of one thought into another, 
a branching into a blooming, buzzing array of new beliefs and intentions, new 
desires and emotions. Many of these mental items remain with us as an 
accumulation in the memory. Perception and cognition thus appear as an 
adding on to what is already in our minds, and not as a subtraction; for once we 
have seen or known something we never undergo the hypothetical reverse 
process of unseeing or unknowing that thing. This point was nicely illustrated 
by Costa de Beauregard [1963, p. 115] when he remarked on the absurdity of 
supposing that, having read some book, we could delete from our minds 
everything said in the book by the act of reading it backwards, from end to 
beginning! 

The branching which occurs in mental activity is often the making of 
connections—it is the putting together of clues to form some new and 
meaningful whole in the mind. As Polanyi in particular has emphasized, once 
the new whole has been grasped its clues take on a different character. A nice 
example is provided by problem pictures such as the one shown in the Tractatus 
where we see an animal's head facing to the left, and then we quickly realize 
that the picture also shows a different animal's head facing to the right. Once 
that double meaning has been grasped we cannot withdraw that understand- 
ing and see the picture as representing one animal only. The cognition is 
irreversible. 

In short it seems ridiculous to suppose that mental processes could possibly 
occur in reverse. We would find ourselves losing wisdom and experience; also 
action would precede intention, thus (it would appear) making nonsense of 
morality! 

So much for this distinguishing of several classes. What is very remarkable is 
that they are entirely consistent with each other as regards the temporal 
ordering of events. There is but the one ‘arrow of time’! For example, if a 
number of events are placed in an order of ‘before’ and ‘after’ by the criterion 
provided by human consciousness, the same ordering is achieved by using an 
‘entropy clock’ (involving some chosen physico-chemical process), or by 
judging the order relative to the outwards flow of radiation or of particles from 
a source, or again by judging the order relative to some biological growth 
process of the type of rings in a tree trunk. Of course there will occur exceptions 
to this statement if fluctuation phenomena are significant, but this is very 
unlikely except in very small systems. 


3 IRREVERSIBILITY DEFINED 


The notion of irreversibility has been taken so far as tacitly understood, but it 
needs now to be expressed more exactly. In fact irreversibility is a matter of 
degree and for this reason it is best defined as the negation of reversibility 
which is an idealized and limiting kind of process, not capable of being fully 
realized. 

Reversibility and its negation are characteristics neither of ‘things’ nor of 
theories, but only of the processes which can occur in ‘things’. Let us 
concentrate attention, at least for the present, on those macroscopic and 
inanimate 'things' which can be specified in terms of their temperature, 
volume and chemical composition, together with the intensities of any 
prevailing fields. Such specifications are sufficient to fix the momentary 
macroscopic state of the entity (‘system’) in question; a process is a temporal 
succession of such states due to the changing of one state into another.⁴ 

A process is said to be reversible if, and only if, the system which undergoes 
that process, together with all parts of its environment which are affected, can be 
restored reproducibly⁵ to their original states. For example, let the system go 
from an initial state A, through states B, C, etc., to a final state X. The 
corresponding simultaneous states of the affected environment are α, β, γ, 
etc., up to a final state ω. There is reversibility if it is possible not only for the 
system to be restored from X to A, but for this reversal to be accompanied by a 
simultaneous reversal of the affected parts of the environment from ω to α. In 
short all relevant parts of the universe must be capable of being put back to 
how they were! 

Although the foregoing definition of reversibility can be applied to all the 
classes of Section 2, it is easiest to apply, in a precise and mathematical sense, 
to the thermodynamic class. Some of the authors of textbooks dealing only 
with that class use an alternative definition; namely that a process is said to be 
reversible if it can be made to proceed in the opposite direction by an 
infinitesimal change in the system’s environment. Yet a third definition says 
that a process is reversible if it passes through a continuous sequence of states 
none of which departs more than infinitesimally from an equilibrium state. In 
most physico-chemical situations there are no important differences between 
the three definitions. 

4 Quantum theory shows that the states of an ideally isolated physico-chemical system constitute 
a discrete set. And of course the notion of ‘state’ tacitly supposes that physical entities cannot be 
in two or more states simultaneously. 

5 The condition of reproducibility (i.e. the attainability of reversal whenever it is desired) is 
necessary because a momentary restoration of an original state can occur, in principle, by 
spontaneous fluctuation. This is discussed in Section 4. 

Complete reversibility is not actually attainable in the real world. Irreversibility 
is the natural state of affairs, although the concept of reversibility remains a 
useful idealization for purposes of theory. In the case of physico-chemical 
systems the matter is, of course, closely related to the Second Law of 
thermodynamics: the overall entropy change is zero only in the limiting case of 
a reversible process; in all real situations the entropy of system plus 
environment increases. Thus in the former case the application of theory 
results in mathematical equalities; in the latter case it yields only inequalities. 
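
The contrast between equalities and inequalities can be stated compactly. The notation below is introduced here for illustration only and is not the author's: the total entropy change is split into a part for the system and a part for the affected environment.

```latex
% Assumed notation: total change = system part + environment part.
\Delta S_{\text{total}}
  = \Delta S_{\text{system}} + \Delta S_{\text{environment}} \ge 0,
\qquad
\Delta S_{\text{total}} = 0 \ \text{only in the limiting reversible case.}
```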

Irreversibility can, of course, be minimized under carefully controlled 
laboratory conditions; for example by using mechanical systems which are 
almost frictionless, or, in the case of electrical processes, by using 
superconductors. A familiar example is the vaporization of a liquid by means of a piston and 
cylinder together with a heat reservoir. The liquid is vaporized by drawing out 
the piston so slowly that the pressure above the liquid is only very slightly 
lower than the equilibrium vapour pressure. The liquid takes in heat from the 
reservoir which is at a temperature only very slightly higher. Subsequent 
recompression at a pressure minimally greater than the vapour pressure 
results in recondensation, and the amount of heat restored to the reservoir is 
then only minimally greater than the amount withdrawn during the original 
vaporization. Thus a cycle has been completed on the substance in question 
and, at the same time, the ‘external world’, namely the heat reservoir, has been 
put back almost to what it was. 


4 T-INVARIANT THEORIES 


The consideration of irreversibility does not arise in those theories of science 
which are concerned solely with structure (whether this be the structure 
of atoms or of living creatures), but it does arise as soon as theory seeks to 
deal with motion, or with other processes of change. It is of great importance 
that all existing theories of the latter kind are ‘time-invariant’. That is to 
say the replacement of t by −t in the theory's equations makes no alteration 
to any of its predictions. This applies as strongly to relativity and to 
quantum mechanics⁷ as it does to Newtonian mechanics and to 
electromagnetism.⁸ 

6 Remaining close to equilibrium is a necessary condition for reversibility in the first sense, but is 
not a sufficient condition. This is nicely shown by an example due to Allis and Herlin [1952] 
concerning gas expansion into a vacuum when it is made to occur by the successive breaking of 
an infinite sequence of membranes. 

The question arises: How can these t-invariant theories describe the 
irreversible processes of the real world? The answer, of course, is that they 
don’t! What they can be used to describe is the idealized limiting case which 
was discussed in Section 3. For instance, in the case of Newtonian mechanics 
the theory can be applied to the motions of bodies (such as the planets and 
stars) which are not surrounded by a resisting medium, and which do not 
undergo inelastic collisions between themselves. Otherwise corrections or ad 
hoc additions to the theory have to be made in order to achieve agreement with 
experiment. To be sure there are a number of philosophers who seem to 
suppose that the term ‘theories’ means the same thing as ‘laws of nature’, and 
who are thus led to the notion that, because the putative ‘laws’ are t-invariant, 
the two directions along the t-coordinate are entirely equivalent. ‘. . . the only 
plausible way’, wrote Mehlberg [1961], ‘of accounting for the fact that so 
many well-established and comprehensive laws of nature somehow conceal 
time’s arrow from us is simply to admit that there is nothing to conceal. Time 
has no arrow.’ 
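
The contrast drawn above, between Newtonian motion without a resisting medium and motion that needs an added correction, can be sketched numerically. The example below is an illustration under assumptions that are not in the text: a unit mass on a linear spring, integrated by the velocity-Verlet scheme, with a hypothetical friction coefficient `gamma` standing in for the resisting medium. 'Time inversion' is applied as described for t-invariance: the position is kept, the velocity is reversed, and the same equations are run forward again.

```python
def accel(x, v, k, gamma):
    """Acceleration of a unit mass: spring force plus optional friction."""
    return -k * x - gamma * v


def evolve(x, v, steps, dt=0.01, k=1.0, gamma=0.0):
    """Advance (x, v) by `steps` velocity-Verlet steps.
    With gamma = 0 this scheme is itself symmetric under velocity reversal."""
    for _ in range(steps):
        a = accel(x, v, k, gamma)
        x_new = x + v * dt + 0.5 * a * dt * dt
        v_pred = v + a * dt  # predictor, used only by the friction term
        a_new = accel(x_new, v_pred, k, gamma)
        v = v + 0.5 * (a + a_new) * dt
        x = x_new
    return x, v


def round_trip_error(gamma, steps=1000):
    """Run forward, reverse the velocity, run forward again, and measure
    how far the time-inverted end state is from the initial state."""
    x0, v0 = 1.0, 0.0
    x1, v1 = evolve(x0, v0, steps, gamma=gamma)
    x2, v2 = evolve(x1, -v1, steps, gamma=gamma)
    return abs(x2 - x0) + abs(-v2 - v0)


print("frictionless:", round_trip_error(0.0))
print("with friction:", round_trip_error(0.5))
```

In the frictionless case the system retraces its path to within rounding error; with the friction term added, the reversed run does not recover the initial state, since the damped motion simply continues to decay.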

The effect of such a claim is to make the notion of irreversibility appear 
entirely foreign to physical science. Yet this is not only contrary to the reality of 
irreversibility in human experience but it is also entirely contrary to what is 
accepted about the objective world in those sciences (notably biology, geology 
and astrophysics) which deal with evolving systems.⁹ 

Let us briefly consider the logic of the matter. It is true that if a process can be 
made to occur with close approximation to reversibility then it requires a 
t-invariant theory for its description. (Equivalently a t-noninvariant theory 
describes irreversible processes.) On the other hand if a theory is t-invariant the 
processes it describes may or may not be capable of occurring reversibly; for the 
t-invariance of a theory is a necessary but not a sufficient condition for the 
reversibility of a process it purports to describe.¹⁰ Other factors which are 
important relate to the proneness of a system's internal dynamics to develop 
instabilities, and to the effects of quite minute disturbances originating in the 
environment. For example, it is a commonplace that t-invariant laws may 
apply quite accurately to processes occurring at the molecular level, and yet 
that macroscopic systems containing vast numbers of molecules may behave 
irreversibly, at least during periods of time much shorter than the Poincaré 
recurrence period. And of course showing that this is so provides much of the 
content of statistical mechanics. The ‘laws’ describing the behaviour of such 
systems become probabilistic in character, and also (because of the 
significance of acts of ‘preparing’ the system in question) the probabilities projected 
‘towards the past’ may not be symmetric with the probabilities projected 
‘towards the future’. 

7 In the case of quantum mechanics the basic Schrödinger equation for the state vector ψ does 
not contain ∂t as a square but only as a first power. However, it is the square of ψ which is 
significant in regard to what is observable and, after allowing for this, it remains the case (as in 
the other theories mentioned) that the replacement of t by −t makes no difference to the 
predictions. Nevertheless there continues to be lively discussion in the literature on the 
questions whether QM is fully t-invariant, and on whether it ought not to be. Phenomena 
which are effectively irreversible certainly occur at the single-particle level, e.g. the decay of 
nuclei, the absorption of particles in photographic emulsion, etc. Then again the ‘measurement 
problem’ remains very puzzling and seems to involve irreversibility at the micro-level. 
Furthermore the decay of neutral K mesons provides apparently good evidence that there are 
instances of failure of t-invariance at the atomic level. It may be that QM is ‘incomplete’ 
precisely in regard to irreversibility. 

8 It should be added that t-invariance requires not only the replacement of t by −t but also the 
inversion of those vector quantities which relate to the entities in question; for example, particle 
velocities and spins must be reversed in direction and, if a magnetic field is present, this too 
must be reversed. One then speaks of the system in question as being in its 'time-inverted' state, 
and these inversions and replacements result in the predicted motions or changes being the 
same for 'time towards the past' as for 'time towards the future'. 

9 Prigogine and his colleagues are prominent among those scientists who reject Mehlberg's view. 
Prigogine accepts irreversibility, and the reality of ‘time’s arrow’, from the start and he aims at 
embedding the existing t-invariant theories within a much wider framework. 

No doubt those who support Mehlberg’s view can claim quite correctly that 
Poincaré’s famous theorem shows that a system containing only a finite 
number of particles must eventually return to a state arbitrarily close to its 
initial state.11 The expected ‘recurrence time’ is, of course, immensely long:
typically it is of the order 1010” years for a system consisting of the Avogadro 
number of molecules. When one speaks, as I have done, of the reality of 
irreversible processes one is saying that processes occurring in macroscopic 
systems are effectively irreversible; and indeed they are, during all periods of the 
order of the age of the universe, say a mere 10^10 years!

Of much greater significance are the fluctuation phenomena which 
represent small (and usually exceedingly small) deviations from the most 
probable state of a system. These too are very infrequent. For instance, a 
diminution of the entropy of a gram mole of helium by only a millionth part is 
not to be expected more than once, on the average, in each 1010? years 
(Denbigh [1981], p. 106). Even so the reality of fluctuations is confirmed 
experimentally by such phenomena as the Brownian motion and the blueness 
of the day sky.12
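
The rarity of such fluctuations can be illustrated with a rough calculation. The following is only a sketch under stated assumptions: it uses Einstein's fluctuation formula, in which the relative probability of a spontaneous entropy decrease ΔS is exp(-ΔS/k), and it takes the molar entropy of helium gas near room temperature to be about 126 J/K. Neither the formula nor that figure appears in the text above, and the result is a probability, not the waiting time quoted from Denbigh [1981].

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def log10_fluctuation_probability(delta_s: float) -> float:
    """Return log10 of exp(-delta_s / k) for an entropy decrease delta_s in J/K.

    Working with the logarithm avoids floating-point underflow, since the
    probability itself is far too small to represent directly.
    """
    if delta_s < 0:
        raise ValueError("delta_s is the magnitude of the decrease and must be >= 0")
    return -delta_s / (K_B * math.log(10))


# Assumption (not from the article): molar entropy of helium gas, about 126 J/K.
MOLAR_ENTROPY_HELIUM = 126.0

# A diminution of the entropy by one millionth part.
delta_s = 1e-6 * MOLAR_ENTROPY_HELIUM

exponent = log10_fluctuation_probability(delta_s)
print(f"log10(probability) is roughly {exponent:.3e}")
```

On these assumptions the exponent is negative and of order 10^18 in magnitude, which gives some feeling for why such deviations are never observed on the macroscopic scale.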


10 See also Bunge [1968] and Hobson [1971]. 

11 Poincaré’s theorem was based on the classical mechanics but a somewhat similar theorem
holds in quantum mechanics (Ono [1949]; Percival [1961, 1962]; Hobson [1971]). 

12 In the light of fluctuation phenomena the Second Law must be regarded as being probabilistic 
rather than absolute in character. Nevertheless its status as an ‘impossibility theorem’ can be 
recovered by reformulating the ‘law’ in such a way as to imply the impossibility of knowing 
when recurrence will occur. With this in view, Jaynes [1963] reformulated the Second Law as 
follows: Spontaneous decreases in the entropy, although not absolutely prohibited, cannot 
occur in an experimentally reproducible process. 




Similar phenomena occur very commonly in the domains of biology and 
neurology, although they are known by names different from ‘fluctuations’. In 
biology they are the random mutations which provide the basis for natural 
selection. In the field of mental phenomena there is, I think, no single word 
which is used to denote those showers of disconnected thoughts, those random 
clusters of events in the brain, which, after conscious selection, provide the
material for novel reasoning. As I put it elsewhere (Denbigh [1981], p. 161) 
'... human mental activity involves the processes of picking out, and of 
amplifying, from what the chance processes produce, whatever may be useful, 
say, for a plan of action, or for the creation of a new musical composition or a 
new theory.’ 


5 THE TEMPORAL REFERENCE DIRECTION 


When it is said that some process, say A → Z, is very unlikely to occur in
reverse (with all its environmental effects also reversed) this is to say that the
moment at which Z occurs is later than the moment when A occurs and that
this is always the case, i.e. on every repetition of the process. For it is an
important empirical fact that macroscopic processes are not normally (i.e.
apart from the exceedingly infrequent recurrence phenomena) observed as
spontaneously proceeding in one direction on some occasions and then as 
spontaneously proceeding in the reverse direction on other occasions, relative 
to a temporal reference direction to be defined below. If this were not the case 
the world would certainly seem much more wayward and incomprehensible 
than it does! 

Thus irreversibility seems to presuppose 'time's arrow'? Or is it perhaps the 
other way round—i.e. that time's arrow is based on, and thus presupposes, 
irreversibility? 

It is, of course, another, although closely related, empirical fact that all
sequences of events at a given location can be accommodated (as was said
already in Section 2) within a single temporal order. Only one 'time' is needed
at each location. That is to say, an ordering according to ‘later than’ suffices for
every sequence of events of which we can be aware and this includes the events
which are the receivings of signals from distant events such as super-novae. As
is well known the relation ‘later than’, as applied to instants, has the properties 
of asymmetry, transitivity and connectivity which are required for the creation 
of a serial order. From this point of view the notion of an order of events 
logically precedes the notion of ‘time’. 

In the physics text books this order is often represented as a straight line, a
representation which goes back to Aristotle, if not earlier. But of course the 
temporal order has the important attribute which has just been discussed and 
which is not possessed by a straight line. A line, as such, has no intrinsic 
direction—there is nothing 'within' the line to distinguish 'leftwards' from 




‘rightwards’.13 This lack of the quality of directedness in a line may be 
compared with the evident directedness of the sequence of real numbers since 
this sequence exhibits the relation of 'greater than', and this is intrinsic to the 
numbers. Thus if x and y are any two real numbers, negative, zero or positive, 
there exists the property which may be expressed: 


$$y > x \;=_{\mathrm{Df}}\; (y \neq x)\,(\exists z)\,(x + z^{2} = y).$$
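
The definition can be tried out numerically. The sketch below is an illustration only: the function names `witness` and `greater` and the sample values are hypothetical, and floating-point numbers stand in for the reals. It treats y as greater than x when the two differ and some real z satisfies x + z² = y, and then checks asymmetry, transitivity and connectivity on a handful of values.

```python
import math
from itertools import product


def witness(x: float, y: float):
    """Return a real z with x + z**2 = y (up to rounding), or None if none exists."""
    d = y - x
    return math.sqrt(d) if d >= 0 else None


def greater(y: float, x: float) -> bool:
    """y > x by definition: y differs from x and some real z gives x + z**2 = y."""
    return y != x and witness(x, y) is not None


# Hypothetical sample of negative, zero and positive reals.
samples = [-2.5, -1.0, 0.0, 0.5, 3.0]

asymmetric = all(
    not (greater(a, b) and greater(b, a)) for a, b in product(samples, repeat=2)
)
transitive = all(
    greater(a, c)
    for a, b, c in product(samples, repeat=3)
    if greater(a, b) and greater(b, c)
)
connected = all(
    greater(a, b) or greater(b, a) for a, b in product(samples, repeat=2) if a != b
)
print(asymmetric, transitive, connected)
```

Nothing external to the numbers is consulted in deciding which of two is the greater, which is the sense in which the direction is intrinsic to them.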


How does the matter stand in the case of the temporal order? In all that 
concerns our own consciousness we do indeed endow this order with a clear 
direction, for we are aware (presumably through the action of short-term 
memory) of events as being 'earlier than' or 'later than' each other. (To be sure 
we are also aware of events as being fleetingly 'now' or 'present', and whether 
or not this aspect of ‘time’ is objective, in the sense of being independent of
consciousness, has been the subject of much controversy. However, the only 
relevant issue here is the bearing of irreversibility on the directedness, or 
‘arrow’, of the temporal order; 'nowness' is irrelevant.) 

Notice that if some particular process A → X can be reversed, together with
all consequential changes in the environment, this would obviously not be
taken to mean that ‘time itself’ had been reversed. Thus if the process A → X
begins at time $t_A$ and if the reverse process X → A is made to complete itself in
the same physico-chemical system at time $t_{A'}$, we would certainly not wish to
equate $t_{A'}$ to $t_A$, as if time had indeed ‘gone back’ on itself in some sense. The
good reason why we reject such a possibility is, of course, that the world is full 
of innumerable other spontaneous and distinguishable physico-chemical 
processes which continue to proceed unidirectionally during the period when 
the reversal of any one of them is made to occur. 

What is very significant, although very familiar, is that these multitudinous
physico-chemical processes share a common attribute, namely that their
overall entropy changes occur in parallel, as is asserted by the Second Law.14
Thus


$$({}^{\alpha}S_i - {}^{\alpha}S_k)\,({}^{\beta}S_i - {}^{\beta}S_k) \geq 0,$$


where ${}^{\alpha}S$ and ${}^{\beta}S$ refer respectively to the entropies of any pair, $\alpha$ and $\beta$, of
systems plus their environments, and i and k refer to any pair of instants
irrespective of which of the instants is chosen by convention as having the
character of being later than the other. It is notable that in this version of the
Second Law, Schrödinger’s version, the subjective aspect of the judgment
about ‘later than’ is by-passed. Even so it is convenient in science to adopt the


13 Ordering along a line requires either the use of an external viewpoint, or that the line has an 
extreme element, a terminus. 

14 Of course it is assumed that the processes in question occur on the macroscopic scale. Otherwise
fluctuation phenomena could not be neglected. 




convention that moment k is ‘later than’ moment i if the entropies increase 
from i to k. For in that way we avoid what would otherwise be a silly clash 
between ‘scientific time’ and ‘human time’. 
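
That Schrödinger's version of the law makes no use of ‘later than’ can be seen in a small numerical sketch. The inequality says that, for any two instants, the entropy differences of any two systems (each taken with its environment) have a non-negative product. The entropy readings and the function name below are hypothetical and serve only to show that the test gives the same verdict whichever way the instants are labelled.

```python
from itertools import product


def entropies_parallel(s_alpha, s_beta, i, k) -> bool:
    """Check (S_alpha[i] - S_alpha[k]) * (S_beta[i] - S_beta[k]) >= 0."""
    return (s_alpha[i] - s_alpha[k]) * (s_beta[i] - s_beta[k]) >= 0


def parallel_for_all_pairs(s_alpha, s_beta) -> bool:
    """Apply the check to every pair of instants, in both orders."""
    n = len(s_alpha)
    return all(entropies_parallel(s_alpha, s_beta, i, k)
               for i, k in product(range(n), repeat=2))


# Hypothetical entropy readings (arbitrary units) at four instants.
alpha = [1.0, 1.4, 1.9, 2.0]
beta = [5.0, 5.2, 5.2, 6.1]

# The verdict does not depend on which end of the sequence is called 'later'.
forward = parallel_for_all_pairs(alpha, beta)
relabelled = parallel_for_all_pairs(alpha[::-1], beta[::-1])

# A pair of systems whose entropies ran in opposite senses would violate it.
anti = [6.1, 5.2, 5.2, 5.0]
violated = not parallel_for_all_pairs(alpha, anti)

print(forward, relabelled, violated)
```

The convention that k is ‘later than’ i when the entropies increase from i to k is then a further step, which fixes the labelling without adding to the empirical content of the inequality.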

Returning to the title of this section, it will be seen that the reference 
direction can be chosen as based either on a consensus of all macroscopic 
processes or on some one long-lived process, such as the decay of a radio-active 
element, which is used as a standard. Neither of these procedures would have 
the effect of relegating the status of the Second Law to that of a tautology since 
the empirical content of the law is the near-universality of the parallelism of 
entropy changes, as expressed by the foregoing inequality. 

Finally a few words about whether irreversibility implies that ‘time itself’ 
has a direction. In my view time cannot be regarded as an existent; it is not a 
real ‘something’ (although it is based on a real relationship). For the essential 
characteristic of ‘things’ is their persistence in time and it would clearly be 
vacuous to say that time persists in time. Although it would be out of place in 
this article to discuss absolute v. relational views of time, I believe that ‘time’, at
any one location, is nothing more than a metric based on the relationship of
‘later than’ as it pertains to events at that location.15 This Aristotelian view is,
of course, entirely consistent with special relativity whose function is to relate
the temporal orders at different locations, using an assumption about the
maximum speed of signalling. It may be noted too that it is a bad linguistic
usage to speak of events (as distinct from 'things') as being in time, since it is
events which are constitutive of time.16


6 IRREVERSIBILITY HAS NO SPATIAL ANALOGUE 


Before dealing with this issue one must first ask: In terms of what items of 
language should the issue be expressed? Clearly not by saying that one can 'go 
along' time in one direction only, and that an analogous restriction does not 
apply to space. For ‘going along’ is itself a temporal notion and therefore the 
correct spatial analogue could not be 'going along' space. 

What is required, I think, is the making of a comparison between a sequence
of events, $E_1, E_2, \ldots$, and a series of locations $L_1, L_2, \ldots$ along a straight line. We
have to inquire what property is conferred on the event sequence by the fact of
its irreversibility which has no analogue in the case of the location series. Of
course the ordering relation for the one is ‘later than’ and for the other it is ‘to
the right of’, but this is not the relevant distinction. What is relevant is the


15 As noted by Bohm [1987] ‘the concept of time must involve both irreversible process and 
recurrent (cyclical) process’, for it is the latter which provides a reliable measure. 

16 The issues discussed in this section are dealt with more thoroughly in my book Three Concepts of
Time [1981]. 




existence of entropy, and the fact that entropy is a function of a body’s state, but 
is not dependent on its location. (That state is, of course, an event in the body’s 
history.) What is also very important is the Second Law which, as was seen in 
the previous Section, establishes a parallelism between the entropy change of 
any one body and the entropy change of any other. (The term ‘body’ is here 
intended to include all relevant parts of the environment.) In short the event 
sequence $E_1, E_2, \ldots$ relating to the body can be re-expressed as an entropy
sequence $S_1, S_2, \ldots$, and one then obtains the relationship, as already quoted
in Section 5, between the entropy changes of any two bodies. 

The significant point in the present context is that there is no analogous 
parallelism involving the locations of two or more bodies. Thus there is no 
irreversibility in space and there is no spatial counterpart of entropy. 

Thermodynamics thus goes far beyond special relativity in pointing up the 
distinction between time and space. (Remember that Einstein himself accepted 
that one cannot telegraph into the past!) Whiteheadian philosophy makes 
the same claim, although in very different terms. The world is seen as creative, 
and the temporal process of producing what is new is the fundamental 
reality. 


7 DOES IRREVERSIBILITY NECESSARILY INVOLVE ENTROPY? 


As was seen in Section 2 there is uncertainty about whether or not the 
irreversible outwards flow of radiation, or of particles, into space is character- 
ized by entropy increase, as are the other familiar physico-chemical processes. 
Below I raise the same question about phylogeny, ontogeny and mental
activities, i.e. the question about whether the irreversibility of these
processes, considered in themselves and in abstraction from metabolism and other
physiological action, is of an entropic kind. This issue is, of course, closely bound
up with the project of reductionism, and it is also bound up with the mind/body 
problem. 

Leaving these latter questions aside, it needs first to be said that, because 
entropy is a physico-chemical quantity, it requires the attribution to it of a 
spatial location. For example, if it were claimed that biological evolution is, in 
itself, an entropic process it would need to be asked whether the entropy 
increase in question is held to be located in the total mass of all living 
organisms, or in the whole eco-system or in the Earth’s biosphere. Clearly the 
answer cannot be obtained by experiment, but it is conceivable that it might be 
obtained by theoretical argument. 

However, this proves to be a mirage. For one reason because living creatures 
are ‘open’ systems, in the sense that they exchange material, as well as energy, 
with their environment. An estimation of their rate of entropy production 
would therefore have to take account of all consequential changes in their 




surroundings, and this would be exceedingly difficult. For another reason 
because they are such complicated systems anyway, and this defeats the 
possibilities of calculation which might otherwise be achievable by the 
methods of statistical mechanics. Faced with these difficulties, a number of 
biologists have leaned rather heavily on information theory but, in my view,
the value of what they have claimed is vitiated by the confusion which exists
concerning the significance of the familiar expression $\sum_i p_i \ln p_i$. As I have
argued elsewhere [Denbigh and Denbigh, 1985], this measure does not mean
at all the same thing in Shannon's information theory as it does in statistical
thermodynamics. Identity of mathematical form is not sufficient for this to be
the case, for it is a matter of what the $p_i$, and the summation, refer to. Thus
what Shannon dubbed ‘entropy’ is not the established entropy of
thermodynamics!

Yet another difficulty in answering the question raised in the section
heading lies in the matter of whether it is indeed legitimate to consider
processes such as phylogeny, ontogeny, and also the mental processes, in
abstraction from the biochemical and physiological processes which necessarily
accompany them. These latter processes can be identified and studied
separately in the laboratory, and they are undoubtedly entropy-producing
when they occur spontaneously. To suppose that phylogeny,
etc., are something distinct is uncertain—and yet that is clearly what we have 
in mind when we regard, say, evolution as being its own kind of natural 
process! 

Perhaps it is most reasonable to think of the processes of phylogeny, etc., as 
being linked in no more than a contingent manner to the underlying bodily 
processes. By this I mean that there is no 1:1 relationship—e.g. in regard to 
dependence on mass. For example a speciation event, occurring over a long 
period, is just as much the coming into existence of a new species whether it 
occurs in small creatures or in much larger ones having greater metabolism. 
Ontogeny, too, remains the same distinctive process of cell differentiation, and 
of the development of tissues and organs, quite irrespective of the size of the 
particular organism, and is thus independent of the overall entropy produc- 
tion. Similarly again in the case of mental activity: different people differ 
immensely in regard to the amount of profound new thought they can 
produce, even though the amount of physical energy dissipated in their brains 
does not vary very much from one of them to another. 

One is at least safe in saying that it is an entirely open question whether
these forms of irreversibility are entropic or not. For certainly the occurrence of
entropy increase can be established with reliability only in the case of what, in
Section 2, I called the thermodynamic class of processes (e.g. diffusion,
mixing, etc.) since it is only the members of this class which can be made to
occur in closed laboratory systems.




8 FANNING OUT TOWARDS THE FUTURE 


There is no really comprehensive theory of irreversibility and it seems unlikely
that such a theory could ever be created in view of the great variety of
irreversible processes. My own limited aim has been to argue that irreversibility
is a broader concept than is entropy increase, and to suggest that the
common feature of irreversible processes is that they display divergence
towards the future.17

As will now be said, this divergent quality has three distinct forms: 


(a) a branching towards a greater number of distinct kinds of entities;

(b) a divergence from each other of particle trajectories, or of sections of wave
fronts;

(c) a spreading over an increased number of states of the same entities.


As was seen in Section 2 the first of these forms is exemplified in biology where 
the evolutionary 'tree' continually broadens18 due to the fact that each species
develops variants (arising from mutations and from natural selection acting in 
different habitats) and these variants may eventually, over the course of many 
generations, become so different as to constitute genuinely new species. 
Simultaneously there is a corresponding ‘chemical branching’: the various 
new species of organisms start to produce new sorts of organic molecules 
according to what is of selective benefit to them. And again in ontogeny where 
the cells of the embryo start to branch off into a great variety of new sorts of 
cells which go to form the organs and tissues of the adult organism. In each of 
these instances of branching there occurs a multiplication of types of entities, 
of kinds of material 'things'. 

Something similar is characteristic of mental irreversibility, although the 
entities which branch and multiply are no longer material, except perhaps in 
some neurological sense as brain imprints. For here it is a matter of a continual 
increment (at least until senility) of new items in the mind, adding on to what is 
already present. As has been said, once we have known something we can 
never unknow it. 

The second form is exemplified by the spreading outwards from a point 
source, as was discussed in Section 2. Divergence in the form of the bifurcation 
of dynamic pathways is familiar from the work of Prigogine [1980] and the 
Brussels school. As is well known it is also displayed in the phenomenon of 
'chaos', as is nicely illustrated in a recent paper by Thompson [1989]. 


17 The phrase 'towards the future' is an abbreviation for 'towards times later than any arbitrary 
instant t', and it does not imply a commitment to the well-known A-theory of time which 
asserts the physical reality of 'past', 'present' and 'future'. 

18 Of course I am not here concerned with ‘for ever’, but only with present trends as they occur on
our particular planet.




What I now want to show is that the ordinary thermodynamic irreversibility,
entropic irreversibility, can also be understood as a divergence towards
the future; it is a branching towards an increased number of states of a given
macroscopic entity, and not a branching into different entities or trajectories.

It will help to clarify the significance of divergence in the thermodynamic
context by first posing the question: Why do spontaneous physico-chemical
processes ever occur? This involves two separate issues:


(i) Why is it possible to extract from the environment (or to prepare
artificially) a system which is capable of spontaneous change?

(ii) Having obtained such a system, and having isolated it as completely as
possible, why does it continue to change up to a final equilibrium state, and
why do all such processes have an attribute in common, namely a very
high probability of entropy increase?


The answer to the first question must be referred back to the conditions 
prevailing at the Big Bang. Perhaps this may sound a little pretentious! But 
consider what may appear at first sight to be a trivial question, an instance of 
(i) above: How is it possible to prepare a laboratory system which is not at 
equilibrium? For instance, let it be the system consisting of a block of hot metal 
lying on top of another block which is cold. It will be clear that the preparation 
of any system in which there is a temperature difference requires the 
availability of an energy input, and that the very possibility of having this input 
must be traced back to the Earth’s resources of coal, oil or uranium, and that 
these resources, in their turn, have their origins in the early history of the 
universe as a whole. Even the trivial act of placing the one block on top of the 
other requires muscular effort, and beyond that the intake of foodstuffs, the 
occurrence of photosynthesis in plants and of nuclear reactions in the Sun,... 
Considerations of this sort make it clear, I think, that all possibility of physical 
change is an inheritance, so to say, from the vast potentiality for change which 
existed in the primitive universe. This view of the matter is well supported by 
the existing Big Bang theory.19

The second issue above is very familiar. For present purposes it will be 
sufficient to summarise how the answer to it relates to the concept of 
irreversibility as a divergence. Consider some macroscopic system, isolated as 
well as can be achieved and thus of nearly constant energy, and let W be the 
number of energy eigenstates accessible to the system when it has that energy 
and is at equilibrium. All of the W states are assumed to be equally probable,
i.e. equally likely to be occupied by the system at any instant. Let $S_{BP}$ be a
quantity related to W by the equation $S_{BP} = k \ln W$ where k is Boltzmann's
constant. This quantity was shown by Boltzmann and Planck to behave in a 
manner closely similar to the thermodynamic entropy S. To the extent that 


19 For further discussion on the cosmological understanding of the Second Law, see for example
Gold [1958, 1967, 1974], Gal-Or [1974, 1975], and Davies [1974].




this is true the change of entropy, $S_2 - S_1$, between an equilibrium condition, 1,
and a later equilibrium condition, 2, due to the lifting of a constraint on the
system, is given by


$$S_2 - S_1 = k \ln(W_2/W_1).$$


Now by the Second Law, for any isolated system the entropy change, $S_2 - S_1$,
can only be positive or zero. The former case, where $W_2 > W_1$, corresponds to
the situation where the transition from the equilibrium condition 1 to the new
equilibrium condition 2 can only occur irreversibly (i.e. they are different
equilibria).

Increase of entropy, due to irreversible passage from one equilibrium
condition to another, can thus be interpreted20 as an increase in the number of
quantum states accessible to that system at constant energy. Physico-chemical
irreversibility thus shows itself as a branching into a larger number
of possible states of existence; it is a spreading or dispersal of the system over
those of its eigenstates which are available for occupation when the system's
energy has a fixed amount.21
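
A worked instance may make the scale of this spreading concrete. The sketch below is an illustration under an assumption not made in the text: one mole of an ideal gas is confined to half of a vessel and the partition is then removed, so that the number of accessible states is taken to grow by a factor of 2 for each molecule, giving ln(W2/W1) = N ln 2. The function name is hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann's constant in J/K (exact SI value)
N_A = 6.02214076e23  # Avogadro number per mole (exact SI value)


def entropy_change(ln_w_ratio: float) -> float:
    """Return S2 - S1 = k * ln(W2/W1), given ln(W2/W1) directly.

    The logarithm is passed in because the ratio W2/W1 itself is far too
    large to represent as a floating-point number.
    """
    return K_B * ln_w_ratio


# Assumed illustration: doubling the volume available to one mole of ideal
# gas, so that W2/W1 = 2**N and ln(W2/W1) = N * ln 2.
ln_ratio = N_A * math.log(2)
delta_s = entropy_change(ln_ratio)

print(f"ln(W2/W1) = {ln_ratio:.3e}")
print(f"S2 - S1   = {delta_s:.3f} J/K")  # about 5.76 J/K, i.e. R ln 2
```

An entropy change of a few joules per kelvin thus corresponds, on this reckoning, to an increase in the number of accessible states by a factor whose logarithm is itself of the order of the Avogadro number.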

This completes my phenomenological survey of the different kinds of 
irreversibility, biological and mental as well as physico-chemical. If I am right
in thinking that their common feature is a branching or divergence towards 
the future, this would seem to entail increasing richness and diversity in the 
world. My view is thus not unrelated to Bohm's concept of an unfolding. It also 
has an affinity with certain much older insights—notably that the future is 
open and that whatever can possibly occur will occur. 

I am greatly indebted to Dr Harmke Kamminga for many corrections to the
manuscript, and for valuable suggestions for its improvement. 


Department of the History and Philosophy of Science 
King's College 
London 


20 There are other attempted interpretations of entropy, e.g. as disorder, disorganization, lack of 
information, etc. but counter-examples can be brought against all of these. See, for example,
Denbigh, K. G. and Denbigh, J.S. [1985] and Denbigh [1989]. 

21 It will be appreciated that it has not been necessary for me to deal with the vast field known as
‘non-equilibrium thermodynamics' which is concerned with giving a significance to entropy
during the temporal period when a physico-chemical system is actually undergoing a process 
of change. 


REFERENCES 


ALLIS and HERLIN [1952]: Thermodynamics and Statistical Mechanics. McGraw-Hill.

BALSLEV, A. [1987]: in J. T. Fraser et al. (eds.) Time, Science and Society in China and the
West. University of Massachusetts Press.

BLUM, H. F. [1968]: Time's Arrow and Evolution. Princeton University Press.

BOHM, D. [1980]: Wholeness and the Implicate Order. Routledge and Kegan Paul.




BOHM, D. [1987]: Foundations of Physics, 17, 667.

BUNGE, M. [1968]: Philosophy of Science, 35, 355.

DAVIES, P. C. W. [1974]: The Physics of Time Asymmetry. Surrey University Press.

DE BEAUREGARD, O. C. [1963]: Le Second Principe de la Science du Temps. Editions du Seuil.

DENBIGH, K. G. [1981]: Three Concepts of Time. Springer-Verlag.

DENBIGH, K. G. [1989]: ‘Note on Entropy, Disorder and Disorganization’, British Journal
for the Philosophy of Science, 40, 323-332.

DENBIGH, K. G. and DENBIGH, J. S. [1985]: Entropy in Relation to Incomplete Knowledge.
Cambridge University Press.

GAL-OR, B. [1974]: Modern Developments in Thermodynamics. Wiley.

GAL-OR, B. [1975]: in L. Kubat & J. Zeman (eds.) Entropy and Information. Academia,
Prague.

GOLD, T. [1958]: 11th International Solvay Congress, Stoops, Brussels.

GOLD, T. [1967]: in T. Gold (ed.) The Nature of Time. Cornell University Press.

GOLD, T. [1974]: in B. Gal-Or (ed.) Modern Developments in Thermodynamics. Wiley.

GRAD, H. [1961]: ‘The Many Faces of Entropy’, Communications in Pure and Applied
Maths, 14, 323.

GRÜNBAUM, A. [1973]: Philosophical Problems of Space and Time. Reidel.

HARVEY, P. H. and PARTRIDGE, L. [1987]: Nature, 326, 128; ibid. 329, 397.

HILL, E. L. and GRÜNBAUM, A. [1957]: Nature, 179, 1296.

HOBSON, A. [1971]: Concepts in Statistical Mechanics. Gordon & Breach.

JAYNES, E. T. [1963]: in K. W. Ford (ed.) Information Theory and Statistical Mechanics.
Benjamin, New York.

MEHLBERG, H. [1961]: in H. Feigl and G. Maxwell (eds.) Current Issues in the Philosophy of
Science. Holt, Rinehart & Winston.

ONO, S. [1949]: Mem. Fac. Eng. Kyushu Univ., 11, 125.

PENROSE, O. & PERCIVAL, I. C. [1962]: Proceedings of the Physical Society, 79, 605.

PERCIVAL, I. C. [1961, 1962]: J. Math. Phys., 2, 235; ibid. 3, 386.

POPPER, K. R. [1956]: Nature, 177, 538; 178, 382.

POPPER, K. R. [1957]: Nature, 179, 1297.

POPPER, K. R. [1958]: Nature, 181, 402.

PRIGOGINE, I. [1980]: From Being to Becoming. W. H. Freeman & Co.

SCHIPPER, K. & WANG HSIU-HUEI [1987]: in J. T. Fraser et al. (eds.) Time, Science and Society
in China and the West. University of Massachusetts Press.

SIMPSON, G. G. [1950]: The Meaning of Evolution. Oxford University Press.

THOMPSON, J. M. T. [1989]: Proceedings of the Royal Society, A421, 195-225.

WHITROW, G. J. [1980]: The Natural Philosophy of Time, p. 10. Oxford University Press.


Brit. J. Phil. Sci. 40 (1989), 519-540 Printed in Great Britain 


Ernst Mach Leaves ‘The Church of Physics’


JOHN BLACKMORE


ABSTRACT 


A study of the published and unpublished parts of Ernst Mach’s last notebook
(1910-14) suggests that Max Planck's attack (1908-11) provoked Mach into
opposing ‘The Church of Physics’ more strongly than previously realized. Shortly
after Mach threatened to leave the discipline if belief in atoms were required, Albert
Einstein tried to persuade him to accept atomism (September 1910). Mach declined
to mention Einstein again in his publications and increasingly criticized ‘The
Church of Physics’.

Evidence that Mach opposed relativity theory, and the absence of evidence that
he favored it, are pointed out. It is suggested that Mach's alleged ‘friendly interest’ in
Einstein’s work in early 1914 may have been stimulated by the hope that the
young genius might develop a continuum or field theory to refute Planck's
discontinuity physics.

The paper concludes with suggestions on how philosophers who defend Mach's 
non-realism such as Gereon Wolters and Paul Feyerabend might be better off 
switching to a realist epistemology more compatible with rationally-held science, 
religion, and common sense. 


1 The Relativity Axiom 

2 The Light Constancy Axiom 
3 Mach Versus Planck 

4 Mach Leaves Physics 

5 Philosophical Implications 


1 THE RELATIVITY AXIOM


A look at both unpublished and recently published [Blackmore and Hentschel, 
1985] parts of Ernst Mach’s last notebook (1910-14) suggests that the debate 
with Max Planck over philosophy around 1910 played a more central role in 
Mach's life than this author had previously realized. It may also have been a 
factor in Mach’s opposition to Einstein's theory of relativity, an opposition 
which some philosophers have recently attempted to deny, philosophers who 
will be repeatedly taken to task in this paper. As usual, everything about
Mach may seem controversial, but the gradual accumulation of correspon- 




dence and manuscripts by and to Mach should help channel or retire at least 
some of the differences of opinion. Let us start by refreshing the reader with 
reasonably familiar data about the historical context. 

Only a few months after Max Planck's initial attack on Ernst Mach's 
philosophy at Leiden on 9 December 1908 as injurious to physics, the Austrian 
physicist-philosopher became interested in the new physics of Einstein and 
Minkowski. Perhaps the new ideas could be used against Planck. But first they 
had to be understood, and that was not easy. Mach sent letters to many 
physicists including the young Philipp Frank in the hope of understanding 
Einstein's ideas better, but apparently by late 1910 Mach had already begun to 
suspect that there were fundamental philosophical differences, and that in 
terms of epistemology Einstein stood much closer to Planck than he did to 
Mach. By 1911 the die had apparently been cast. Mach had long rejected the 
reality of atoms, and contrary to Gereon Wolters in his recent book [1987] and 
Paul Feyerabend in his support of that position [1987], p. 218, there is strong 
historical evidence that Ernst Mach (1838-1916) now rejected at least part of 
Albert Einstein's special theory of relativity. No amount of philosophical 
argumentation can change historical facts. 'Save the appearances' phenomenalism 
with its rejection of both atomism and statistical mechanics had 
apparently failed in physics and Planck pointed it out. Had he known about 
Mach's developing doubts about relativity theory, that would have merely 
added grist to his mill. But what is important and not yet fully recognized in the 
literature is that Planck's criticism of Mach begun in 1908 would apparently 
help drive the latter by 1911 into a more complete rejection of modern physics 
than would otherwise have occurred, a rejection which seems to have 
included one or more basic elements of Einstein's theory. 

Let us turn to some evidence for Mach's opposition to relativity, and later, to 
the reality of atoms and 'The Church of Physics'. (If the reader wants to check 
more extensive older data, previously published material, he can turn to 
Holton [1968] and the books by Blackmore [1972] and Blackmore and 
Hentschel [1985].) 

Joseph Petzoldt (1862-1929), a Berlin gymnasial teacher, Privatdozent at 
the Technical University of Berlin, and philosopher of science and longtime 
supporter first of Richard Avenarius and then Ernst Mach, wrote to Mach on 
June 1, 1911: 


You wrote me last concerning the epistemological side of the relativity principle, 
that for you something still seems to be lacking. I also believe that. . . .¹ 


1 ‘Sie schreiben mir zuletzt, dass Ihnen am Relativitätsprinzip hinsichtlich der erkenntniskritischen 
Seite noch manches zu fehlen scheint. Das glaube ich auch . . .' (This letter like the others 
to Mach can be found in the Ernst Mach-Institut, Eckerstrasse 4, Freiburg im Breisgau, 
Bundesrepublik Deutschland. A printed version of the whole letter can be found in Blackmore 
and Hentschel [1985] p. 91.) 


Ernst Mach Leaves ‘The Church of Physics’ 521 


This seems clear evidence that Mach was dissatisfied with at least one of the 
two axioms of the special theory of relativity. Since Petzoldt apparently did not 
keep Mach’s original letter (or any others from Mach from 1906 to 1913), we 
cannot be exactly sure what the epistemological objection was. Wolters, 
however, has guessed that Mach’s objection was to what Don Howard [1987], 
p. 607, calls ‘special relativity’s restriction to inertial frames’, an objection 
presumably met by the general theory. But even if Wolters’ conjecture is true, 
it still means that Mach was dissatisfied with a vital part of Einstein’s special 
theory and may have put considerable weight on that dissatisfaction, hence 
Wolters' answer is unsatisfactory. But even more serious, how can he put so 
much confidence in unsupported speculation about one of the least ambiguous 
items of evidence extant in the literature on Mach’s attitude toward relativity 
as if that speculation allowed him to dismiss that evidence? Strange. 

These comments are not intended to detract from Wolters’ extensive 
research and discoveries concerning Ludwig Mach in his recent book 
mentioned above, Don Howard’s kind defence of me in his review of Wolters’ 
book, or Paul Feyerabend's kind wish to strengthen Mach's reputation in 
numerous articles and books. But the casual manner in which all three 
philosophers disregard the above remark by Petzoldt, which is far stronger 
evidence that Mach opposed Einstein's theory than anything they have found to 
suggest that he favored it, does not reflect especially well on their judgment 
or love of truth. 

There is no evidence that Mach ever withdrew his epistemological objection 
to the relativity axiom. Both Wolters and Feyerabend as non-realists have 
understandably struggled to make Mach compatible with modern science, 
possibly because the rational alternative, to adopt a more realistic epistemology, 
has not yet appealed to them. Don Howard, who presumably is an 
epistemological realist, simply should have been more cautious in leaning 
toward Wolters' arguments. 


2 THE LIGHT CONSTANCY AXIOM 


Concerning the other axiom, the constant speed of light in a vacuum, this was 
strongly opposed by three of Mach's best-known supporters, Friedrich Adler, 
who rejected it as a 'Naturwunder', Mach's son Ludwig who thought it was 
incompatible with observation using the Mach-Zehnder interferometer which 
he had co-invented, and Joseph Petzoldt who often expressed his opposition in 
letters to Mach (1910-11). 

Petzoldt, like Anton Lampa and many of Mach's followers, was initially quite 
impressed by Einstein's special theory since they identified the relativity axiom 
with Mach's epistemological theory of relativity and because they thought the 
theory was ‘phenomenological’, but they couldn’t swallow the light constancy 
axiom. Petzoldt wrote to Mach on 22 September 1910: 


About Einstein's relativity theory as [Johannes] Classen represents it, I have still 
not gained a good understanding of it. Nevertheless, Einstein's basic insight 
seems most excellent to me. But it remains questionable whether he has 
completely freed himself from absolutes. For example, I don't yet see why the 
velocities of light c and c' should be the same. It seems to me to make no 
epistemological sense at all.² 


Whether Mach's epistemological objection was the same as the one Petzoldt 
mentioned here is not certain but remains a possibility. By June 1913 
Petzoldt apparently hoped that relativizing the constancy of light as early 
versions of the general theory seemed to him to do would allow Machians 
to accept relativity theory. But even as late as 1927 Petzoldt rejected 
Einstein's 'absolutism'. He continued to reject the constancy of natural laws.³ 
Needless to say, Einstein did not accept either interpretation of relativity 
theory. 

It is often stated that Petzoldt increasingly favored Einstein's theory; perhaps 
this is imagined because so many of his articles and books seemed to support it 
and especially Mach as Einstein’s forerunner, a view which with the help of 
another of Mach's followers, Philipp Frank, has won considerable circulation 
and plausibility, but if the theory he accepted was significantly different from 
Einstein's understanding of his own theory, then it would be more accurate to 
consider him to have been a mere quasi or conditional supporter at best, and in 
important ways an opponent. One could not become a full supporter by 
relativizing the constancy of light or denying the constancy of natural laws. 
Klaus Hentschel is currently writing a major work on Joseph Petzoldt, and 
when it is published we should learn much more about this enigmatic, 
self-declared positivist. In the 1927 letter cited above, it is also clear that 
Petzoldt gradually accepted that Ernst Mach had indeed opposed Einstein's 
theory in 1914. 

On the other hand, Einstein's theory was not phenomenological as Anton 
Lampa thought, Joseph Petzoldt wanted to believe, and as Gereon Wolters, 
with all due respect, apparently still believes. One could have machines 
measure the constancy of light and many predictions of the theory without the 
intervention of man, his sensations, or his consciousness. Consciously indirect 


2 ‘Zu der Relativitätstheorie Einsteins, wie sie Classen dargestellt, habe ich noch kein genügend 
nahes Verhältnis gewonnen. Doch scheint mir Einsteins Grundgedanke ganz vortrefflich. 
Fraglich ist mir aber, ob er sich ganz vom Absoluten losgemacht hat. Ich sehe z.B. noch nicht, 
warum die Lichtgeschwindigkeiten c und c' gleich sein sollen. Mir will es scheinen, als habe es 
erkenntnistheoretisch gar keinen Sinn . . .’ 

3 Joseph Petzoldt to Albert Einstein, Spandau, March 3, 1927. (Einstein Archive, Boston 
University) 




epistemology, objective epistemology, is still basic in science, as it has been 
since Galileo rejected the ‘save the appearances’ astronomy of Ptolemy and the 
direct realism of Aristotle. 

There is no evidence known to this writer that Mach ever came close to 
accepting any version of Einstein's special theory of relativity, the speculations 
of Wolters to the contrary. As for the general theory, it employed 
multi-dimensional geometry which may not have been fully compatible with Mach's 
theory of economy and which would be strongly opposed by Hugo Dingler who 
attempted to apply Mach’s theory to geometry. Mach himself was a strong 
supporter of Dingler and often praised him both in his correspondence and in 
the preface to the 1912 edition of Mach’s Mechanik. 

There is no evidence known to this commentator that Mach ever expressed 
himself concerning the light constancy axiom, but it is hard to imagine how he 
could have accepted it given Mach’s own epistemological theory of relativity. 
Mach had long believed that all physical phenomena had to have relations 
with other physical phenomena, but the light constancy axiom suggested that 
the velocity of light in a vacuum was independent of all other physical 
phenomena, that is, was ‘metaphysical’ from a phenomenalist perspective. 
The sceptical attitude of Mach’s main followers as mentioned above helps 
confirm this. 

One might add that if one makes a statistical study of the different types of 
objections to Einstein’s special and general theories from the book by Israel, 
Ruckhaber and Weinmann [1931] which lists Mach as an opponent of 
Einstein's theories as well as at least seventeen of his correspondents or 
followers, one will notice that as in the early letters from Petzoldt to Mach the 
most frequent declared objections were to the light constancy axiom. 

In addition to the seventeen above, Mach also corresponded with the 
following opponents of Einstein's theory: Leo Gilbert and M. Gandillot. It might 
be especially interesting if Mach's letters to Gilbert and Gandillot could be 
located, though Mach's usual practice of shifting the subject when 
correspondents discussed relativity should be kept in mind. 

Concerning the impression that Einstein's theory was ‘phenomenological’ 
which many Machists had around 1910, Lampa had written on 1 May 1910 
to Mach: 


If the provisional measurements which Ehrenhaft had carried out while I was in 
Vienna should be confirmed in the continuation of his examination into the 
charges on colloid particles, then electrons would be divisible. Even at that time, 
Ehrenhaft had found particles with half-electrons; since then, as [Professor 
Viktor von] Lang told me, particles with 1/3, 1/5 electrons have also been observed. It 
would be too beautiful, if the electron should suffer the same fate as the atom by 
means of cathode rays. But however it may turn out, I think that the relativity 
theory is the introduction to a phenomenological epoch of physics.⁴ 


(About Felix Ehrenhaft, the reader may want to consult Gerald Holton’s classic 
study of him and Robert A. Millikan [1979].) 

Some years later, Anton Lampa seems to have accepted the reality of atoms 
and electrons with fewer mental reservations and apparently realized he was 
wrong about the ‘phänomenologischen Epoche' and possibly about relativity, 
as he appears to have admitted in a somewhat ambiguous letter to Einstein 
dated 30 May 1920: 


But at that time [1911-12 when Einstein was in Prague] I was not yet clear 
about your thoughts, and I wrestled with them to bring them into my 
epistemological thoughts or way of thinking—which perhaps for a way of 
thinking oriented on Mach seems [seem?] almost self-contradictory. And still it is 
so. But I am, let us say, too mentally slow to win clarity by means of discussions. 
And so it happened as it happened. I have to find my own way by myself.⁵ 


3 MACH VERSUS PLANCK 


But Mach's overt criticism of the relativity principle and probable rejection of 
the light constancy axiom by no means reflected the full vehemence of his 
opposition to basic aspects of the new physics. The following quotations should 
make this evident. In his debate with Max Planck (1908-11), Mach [1910, 
p. 603] wrote as follows (which Friedrich Adler later published in book form 
[1919, pp. 11-12]): 


The essential difference between us concerns belief in the reality of atoms. . . . My 
answer is simple: If belief in the reality of atoms is so crucial, then I renounce the 
physical way of thinking, I will not be a professional physicist, and I hand back 


4 ‘Wenn die vorläufigen Messungen, welche Ehrenhaft ausgeführt hatte, als ich jetzt in Wien 
war, bei dem Fortgang seiner Untersuchung über die Ladungen kolloider Teilchen sich 
bestätigen sollen, so wäre das Elektron teilbar. Schon damals hatte Ehrenhaft Teilchen mit 
halben Elektronen gefunden – inzwischen soll er, wie mir Lang sagte, auch welche mit 1/3, 
1/5 Elektron beobachtet haben. Es wäre doch zu schön, wenn auch das Elektron von 
demselben Schicksal erreicht würde wie das Atom durch die Kathodenstrahlen. Wie das nun 
auch sein mag, denke, dass die Relativitätstheorie die Einleitung zu einer phänomenologischen 
Epoche der Physik ist.’ 

(The author thanks Professor Andreas Kleinert of the University of Hamburg for a copy of this 
and the following letter.) 

5 ‘Aber ich war damals mit Ihren Gedanken noch nicht im Reinen und ich rang damit, sie in 
meine erkenntnistheoretischen Gedanken oder Denkweise einzufügen – was vielleicht für eine 
an Mach orientierte Denkweise beinahe widerspruchsvoll erscheint. Und doch ist es so. Ich bin 
aber, sagen wir, zu schwerfällig, um in Diskussionen Klarheit zu gewinnen. Und so kam es so, 
wie es kam – ich musste mir meinen Weg selber suchen.’ 



my scientific reputation. In short, thank you so much for the community of 
believers, but for me freedom of thought comes first.⁶ 


It must not be thought that in his polemical struggle with Planck Mach was 
without allies. Ernst Lecher, for example, who followed Mach as professor of 
experimental physics in Prague, could write as late as January 1910 about 
Mach's philosophy: ‘Natural scientists are for the most part followers of Mach, 
but not so the professional philosophers.’ Adler [1909], Petzoldt [1910], and 
Frank [1917] all wrote articles supporting Mach against Planck. Lampa even 
published a book [1918] defending the virtues of his hero. Private letters of 
support came from Paul Jensen, Carl Cranz, and Albert Einstein. (See 
Blackmore [1972] pp. 222-3.) 

Max Planck as professor at the University of Berlin had a long arm. Mach, 
who had taught in Prague for twenty-eight years but had been retired (from 
the University of Vienna) since 1901, recommended his former assistant, 
Gustav Jaumann, as professor of theoretical physics at the German University 
of Prague in 1910, but Planck's recommendation of Albert Einstein 'as the 
Copernicus of the 20th Century' (Clark [1971] p. 135) persuaded the Prague 
Commission which included two Machists, Georg Pick and Anton Lampa, to 
recommend Einstein as the superior candidate. (See Blackmore [1972], 
Kleinert [1975], Havranek [1977], and Illy [1979].) 

Wolters [1987], pp. 130-1 has surmised that Einstein visited Mach 
personally in September 1910. Provided that he did not mention atomism, 
Prague, or Planck, presumably the first face-to-face contact should have gone 
well, but the younger man was not particularly tactful during this meeting and 
did his best to persuade Mach to accept the indispensability of the atomic 
theory. (See Weinberg [1937] p. 104, Frank [1947] pp. 104-5, and Cohen 
[1955] pp. 69-73.) 

This was clearly too much for the old man. Instead of recruiting a brilliant 
young physicist to fight Planck, he had invited the Berliner's favorite, an 
out-and-out atomist and prophet of the new ‘metaphysical’ orthodoxy, a second 
pillar to prop the Church of Physics. 

It was apparently at this time that Mach seriously decided to leave the 
discipline. Mach never mentioned Einstein again in print, only once in extant 
correspondence, and when people asked him about Einstein and relativity in 
his correspondence, as they often did, Mach apparently did everything he 
could to change the subject. 

Is there any evidence that Mach accepted Einstein’s theory? None that this 


6 Heilbron [1986] p. 55 translates the same passage: 

‘If belief in the reality of atoms is so essential to you, I will have nothing more to do with 
physical thinking. I will not be a proper physicist. I renounce all scientific reputation, in a word, 
no, thank you, to the community of believers. Freedom of thought is more precious to me.’ 




author can determine, and none that most scholars would call evidence. But 
why does Gereon Wolters insist Mach was ‘positive’ toward the theory? 
Perhaps because he is a philosopher and thinks that reason and argument are 
more important than evidence. Doesn't he have any factual data? He has what 
he considers evidence, namely, an undated letter from Einstein to Mach in 
which Einstein thanks Mach for his 'friendly interest' in his work. (See Wolters 
[1987] chapter 14 and Blackmore and Hentschel [1985] pp. 109-10.) But 
Wolters has a reply to the objection that ‘friendly interest'—even if a fact and 
not just Einstein's conjecture—does not prove that Mach accepted anything. 
His reply is that he has evidence the letter was sent in early 1914 which means 
after the preface to the Optik was allegedly written by Ernst Mach, which could 
mean he had ‘friendly interest’ in the theory after he had supposedly rejected it. 
Wolters considers this inconceivable, hence he regards the letter as evidence 
that the anti-Einstein preface to the Optik was not written by Ernst Mach, as if 
Mach, unlike his son Ludwig, was ‘positive’ toward Einstein's theory, though 
whether he actually accepted it, that is, believed it, is sidestepped. If he has any 
evidence that Mach accepted or believed either the special or general theory, 
this writer has not been able to find it. 

Unfortunately for Wolters' argument, first, physicists who have studied the 
contents of the letter are not agreed that it was sent in early 1914. Three 
physicists, Gerald Holton, Abraham Pais, and John Stachel all give different 
dates ranging from 1911 to 1914. But, second, even if the above physicists 
were to accept his interpretation as Klaus Hentschel suggests they would and 
Wolters were correct about the dating, there is, as we have already shown, 
considerable evidence that Mach rejected Einstein's theory which does not 
depend on the anti-Einstein preface to the Optik, such that even if it were 
'forged' as Wolters has alleged, the weight of evidence would still be against 
Wolters’ revisionist thesis. Third, it is not at all certain that in fact Ludwig 
Mach wrote, i.e. 'forged', the preface and very doubtful that he wrote all of it. 
Fourth, even if he did write all of it, it may still accurately represent Ernst Mach's 
position concerning relativity. And fifth, there would be nothing inconsistent in 
Mach rejecting the special theory on the one hand and even the general theory 
while still having 'friendly interest' in Einstein's work on the other, especially in 
the hope that the young genius could develop a continuum or field theory to annihilate 
Max Planck and the latter's discontinuity physics. 

This hypothesis is supported by two facts. First, when Einstein accepted 
Planck's invitation and moved to Berlin in early 1914, Mach, as if it were no longer 
possible to wean Einstein away from either Planck's physics or philosophy, gave one 
and possibly two original letters from Einstein to a friend as if he didn't want 
them any more. Second, in July 1914 he refused a visit from the same friend 
[Joseph Petzoldt], which Mach's son later wrote was because his father didn't 
want to talk about relativity any more. (See Wolters [1987] chapter 15 and 
Blackmore [1972] p. 281.) 

But even if the likelihood that Mach accepted either the special or general 
theory approaches zero percent in spite of Wolters, did Mach really dislike 
Planck and ‘The Church of Physics’ that much? Let us check the following 
evidence and see. 


4 MACH LEAVES PHYSICS 


It was hard for Mach to leave ‘The Church’. Publishers wanted new editions of 
old books, and correspondents [O. Wiener and P. Engelmeyer] urged him to 
publish his manuscript on the history of optics. Nevertheless, Mach did leave. 
His interests and publications after 1910 increasingly shifted to anthropology 
and psychology, especially to his book Kultur und Mechanik (1915) about how 
primitive people had learned to handle tools. (See Thiele [1963] pp. 219-21.) 
Concerning the Optik, Mach sent a preface or introduction to Adler in 1910, 
possibly hoping that the young physicist-philosopher might have time to finish 
and publish the work, but the son of the head of the Austrian Social 
Democratic Party preferred socialism and politics and late in the same year 
became editor of the Zurich party newspaper. 

Mach turned to his son Ludwig to finish and publish his book on the history 
of optics, which he did, but the son was so modest that he was reluctant to 
appear in print under the title of either editor or co-author, with the result that 
Wolters [1987] has accused him of forgery. Mach also brought his notebooks 
on physics to a close, basically by the end of 1911, but a few entries, mostly on 
subjects other than physics, straggled into it until 1914. (See Blackmore and 
Hentschel [1985] third appendix.) 

Let us give more quotations from Ernst Mach directed against Max Planck 
and ‘The Church of Physics’. Naturally, they can be interpreted in different 
ways and may not have always presupposed opposition to Einstein’s relativity 
theory, but increasingly, relativity theory was becoming, along with Planck’s 
quantum hypothesis, one of the two major ornaments of modern physics, such 
that the more sweeping the criticism, the more likely that it would also be 
directed against the special and at least those early versions of the general 
theory which it is relatively certain Mach had some acquaintance with. 

In Mach's last notebook, which is located in the Ernst Mach-Institut, 
Eckerstrasse 4, Freiburg im Breisgau, written awkwardly in pencil with his left 
hand (his right hand had been paralysed since 1898), Mach made a list of items 
to think about. The following short quote is the third item on the list and comes 
from page 121 between listed dates which would suggest 1912: 


Planck – Vulgäre Physik 


The next quote, also from Mach’s last notebook, which like the one above 
has been included in photographic form in Blackmore and Hentschel [1985], is 
the last sentence on its page and bears no apparent connection with previous 
sentences. The last word was spelt in what Klaus Hentschel considers an older 
German way inspired by Latin rather than according to modern German 
usage. The question Mach asked (apparently in 1913, where it was one of the 
last entries in the book) suggests both that he may have begun to allow for the 
possibility that atoms might be real, but also that he still had reservations. 


Atome nicht occult? 


The following letter was sent from Ernst Mach to Wilhelm Ostwald on July 23, 
1913, the same month in which the anti-Einstein preface to the Optik is dated 
(which Wolters argues is a forgery) and less than a month after a letter from 
Einstein in which Einstein praises and flatters Mach to the skies and mentions 
how much he depended upon Mach’s criticism of Newton's bucket experiment, 
which Einstein at about this time had apparently begun to call Mach’s 
principle. (See Einstein [1912], p. 39 footnote.) If Mach was sympathetic to 
Einstein’s ideas at this time, as the philosopher Wolters thinks, then it is hard to 
imagine how he could have written the last paragraph of the letter to Ostwald. 
Because of its importance, all of it is presented. (One will notice Mach’s 
exaggeration about the recency of intellectual communion with Dingler and 
Petzoldt. He had known the former for three years and the latter for almost 
twenty and they sympathized with and seem to have reasonably well 
understood most of Mach's ideas during those periods of time.) I am indebted 
to Klaus Hentschel for returning a copy of the letter to me. (I had originally 
sent the copy to the late Joachim Thiele who had planned to publish it.) The 
original is in the Ostwald Archive in the Akademie der Wissenschaften in East 
Berlin. 


Both main mistakes, the formally deficient representation of the monistic point 
of view and slight progress from the side of official physics in acknowledging 
my standpoint, are well-known to me. Concerning the first mistake, I take it 
in the most cheerful spirit, though humor has not yet been enough to 
overcome it. To overcome the second mistake lies beyond my power, since at best 
I can merely voice my opinion, but can force neither the physicists nor the 
philosophers to accept it. It has scarcely been a year since independent 
young men like the mathematician Dingler, the philosopher Henning, Petzoldt, 
gymnasial teachers like Dr. Jacob in Vienna have made known by their 
deeds that they have understood me. I will still remain alone for a long time or 
indeed always.⁷ 


7 ‘Die beiden Hauptfehler, die formell mangelhafte Darlegung des monistischen Gedankens und 
der geringe Fortschritt der Anerkennung meines Standpunktes von Seiten der offiziellen 
Physik, kenne ich ganz wol. Den ersten Fehler getraue ich mich bei guter intellektueller 
Stimmung – Humor ist hiezu nicht genug – ganz zu beseitigen. Den zweiten Fehler zu bessern, 
liegt nicht in meiner Macht, da ich höchstens meine Meinung äussern, diese aber weder den 
Physikern noch den Philosophen aufdrängen kann. Es ist kaum ein Jahr her, dass unabhängige 
junge Männer, wie der Mathematiker Dingler, der Philosoph H. Henning, Petzoldt, 
Gymnasiallehrer wie Dr. Jacob in Wien durch ihr Tun bekannt geben, dass sie mich verstanden haben. Ich 
werde noch recht lange oder überhaupt immer allein bleiben.’ 




If Mach felt intellectually alone and isolated as he wrote in this letter, that is, 
less than a month after having received a very flattering letter from Einstein, 
then how could he have supported Einstein's theory? Einstein was the golden boy, 
the center of what Stephen Brush has called ‘The Second Scientific Revolution’. 
To support Einstein's special theory in 1913 was the opposite of being 
isolated within physics. To be genuinely isolated one rejected both Planck's 
‘Vulgäre Physik’ and Einstein's special theory. 

Let us now gradually approach the preface to Mach's book Die Prinzipien der 
physikalischen Optik, published in Leipzig in 1921. Wolters believes that the 
preface was forged by Mach's son Ludwig, but since Mach authorized his son to 
finish and publish the work, the expression ‘forgery’ is not appropriate. First, 
let us quote directly from Wolters’ own book and indirectly from Mach's 
favorite mathematician/philosopher, Hugo Dingler, who had impressed Mach 
by his defence of three-dimensional geometry in physics and who would later 
write against Einstein’s general theory. Dingler had kept a diary (see Wolters 
[1987] p. 271), and in rather laconic terms he described a visit to Mach's 
house in Vaterstetten, Bavaria, on July 5, 1913 (the same month both the 
preface to the Optik and Mach's letter above to Ostwald were ostensibly 
written): 


Questioned regarding work. Relativity theory. Thought about it afterwards. He 
[Mach] has written the 'Optik' three times. His son will have to finish it.⁸ 


Second, there is strong evidence that Ernst Mach wrote at least part of the 
preface himself. The address given at the bottom of the preface 'München- 
Vaterstetten' was used in Ernst Mach's correspondence for a period of about 
three weeks in June 1913 and never before or after. None of the numerous 
extant letters from Ludwig Mach ever use the address. (The letters from Ernst 
and Ludwig Mach appear easy to distinguish. After his paralysis in 1898 Ernst 
seems to have always typed his letters. Ludwig apparently never typed his.) It 
seems that Vaterstetten, on the edge of which Mach lived from June 1913 until 
his death in early 1916, either was not yet an incorporated town or the nearest 
postal station was Haar. Mail was supposed to be sent to ‘Haar bei München'. 
After three weeks of using what postal people apparently considered the wrong 
address, he switched over to the ‘official’ address. 

It is extremely hard to believe that Ludwig Mach seven years later in 1920, 
when Wolters thinks he forged the preface, remembered that for a period of 
three weeks his father had used a different address which Ludwig himself had 
never used and that Ludwig attached that address to the preface. As for 
Wolters’ recent rejoinder [1988, p. 506] that Vaterstetten was where Ernst 
Mach and his son lived and ‘was so to speak the only thing genuine in the 


8 ‘Frug wegen Arbeiten. Relativitätstheorie. Haben darüber nachgedacht. ‘Optik’ habe er 
dreimal geschrieben. Die müsse sein Sohn fertig machen.' 




preface’, the address given in the preface was not 'Vaterstetten' but 
'Vaterstetten-München', the same address as Mach used in letters shortly before July 1913, 
the explicit date of the preface. 

Furthermore, if Mach had written the Optik three times as Dingler above 
remembered hearing, then it seems more than likely that at least a draft 
preface already existed, probably more than one. And indeed, as already 
mentioned, Mach sent a preface or introduction to Adler in 1910 which still 
exists (see Blackmore and Hentschel [1985], first appendix). The extant 
introduction is epistemological in character and suggests that physicists 
should adopt ‘save the appearances’ phenomenalism. There is no mention of 
Einstein or relativity theory. 

Before presenting the relevant parts of the 1913 preface published in 1921, 
let us quote from a letter by Ludwig Mach (see Blackmore [1988], pp. 478-9 
footnote) about observation and relativity. The letter is to Friedrich Adler, who 
proofread the Optik, and is dated 3 March 1918, that is, about two years after 
Ernst Mach's death and three years before final publication. 


You will find little about relativity theory [the preface?] and nothing about 
radiation in the Optik—he explained to me repeatedly that these chapters are still 
much too unclarified to be brought into the book. 

When the second part of the Optik appears, it will include a series of previously 
unpublished experiments, which we carried out together, and these deal 
precisely with those topics. 

At the time, I brought his attention to certain insufficiencies in the Michelson 
experiments, and he confessed uncertainty about them; yes, and I promised him 
to carry out a controlled counter experiment ['Gegenversuch']. The resources are 
there; only the time and quiet are lacking. . . . Up until his death, he [Mach] had a 
somewhat gentle ironic treatment for ions and the new ideas ['Anschauungen'] 
of the Relativists.⁹ 


The probability that Ernst Mach wrote a preface addressed 'Vaterstetten- 
München' and dated ‘Juli 1913’ does not exclude the possibility that Ludwig 
Mach revised or added sentences or even paragraphs to it. We simply do not 
know. Perhaps the best thing the reader can do is compare the preface to the 
letter above to Wilhelm Ostwald written during the same month for apparent 
similarities and differences concerning content and style. Below are the parts of 


9 ‘Über die Relativitätstheorie werden Sie wenig, über die Strahlung gar nichts in der Optik 
finden—er erklärte mir wiederholt, diese Capitel seien noch viel zu ungeklärt, um in die 
Darstellung aufgenommen zu werden. 

Als zweiter Teil der Optik erscheinen indessen eine Reihe von bisher unveröffentlichten 
Experimenten, die wir miteinander machten, und diese behandeln gerade diese Themata. 

Ich habe ihn seinerzeit auf gewisse Unzulänglichkeiten des Michelson Versuches 
aufmerksam gemacht, und er gab die Unsicherheit zu; ja—und ich versprach ihm, den 
Gegenversuch auszuführen—Die Behelfe sind da, es fehlt noch die Zeit und Ruhe . . . Er hatte 
bis zu seinem Tode etwas leise Ironisierendes für die Ionen und die neuen Anschauungen der 
Relativisten—’ 




the published preface concerned with atomism, relativity, and ‘The Church of 
Physics’. 

On the other hand, the most important thing is not who wrote the preface 
but whether it accurately represents Ernst Mach’s beliefs at that time. This 
author thinks it does. 

Gerald Holton [1973], p. 230 has translated two paragraphs of the preface 
dated ‘July 1913’ as follows: 


I gather from the publications which have reached me, and especially from my 
correspondence [Petzoldt to Mach, June 15, 1913 and Einstein to Mach, June 25, 
1913], that I am gradually being regarded as the forerunner of relativity. I am 
able even now to picture approximately what new expositions and interpreta- 
tions many of the ideas expressed in my book on Mechanics will receive in the 
future from this point of view. It was to be expected that philosophers and 
physicists [Planck?] should carry on a crusade against me, for, as I have 
repeatedly observed, I was merely an unprejudiced rambler endowed with 
original ideas, in varied fields of knowledge. I must, however, as assuredly 
disclaim to be a forerunner of the relativists as I personally reject the atomistic 
doctrine of the present-day school or church. 

The reason why, and the extent to which, I reject [ablehne] the present-day 
relativity theory, which I find to be growing more and more dogmatical, together 
with the particular reasons which have led me to such a view—considerations 
based on the physiology of the senses, epistemological doubts and above all the 
insight resulting from my experiments—must remain to be treated in the 
sequel.10 


If we compare the preface with Mach's 1910 reply to Planck and the 23 July 
1913 letter to Ostwald we notice a number of similarities, particularly a 
persecution complex and his attitude toward physics and atomism. He 
specifically rejected atomism in the 1910 rejoinder to Planck and in the preface; 
and his defence of ‘monism’, i.e. phenomenalism, in the July 23 letter implied 
rejection. Physics is associated with a church both in the 1910 work and in the 
preface and is derogated as 'official physics' in the 1913 letter. As for 


10 ‘Den mir zugegangenen Publikationen und vor allem meiner Korrespondenz entnehme ich, dass 
mir langsam die Rolle des Wegbereiters der Relativitätslehre zugedacht wird. Nun kann ich mir 
heute ein ungefähres Bild davon machen, welche Umdeutungen und Auslegungen manche 
der in meiner Mechanik niedergelegten Gedanken von dieser Seite in Zukunft erfahren werden. 

Wenn die Philosophen und Physiker den Kreuzzug gegen mich predigten, so musste ich dies 
natürlich finden und war damit ganz einverstanden, denn ich war, wie ich dies wiederholt 
dargetan habe, auf den verschiedenen Gebieten doch nur ein unbefangener Spaziergänger mit 
eigenen Gedanken, muss es aber nun mit derselben Entschiedenheit ablehnen, den Relativisten 
vorangestellt zu werden, mit welcher ich die atomistische Glaubenslehre der heutigen Schule 
oder Kirche für meine Person abgelehnt habe. 

Warum aber und inwiefern ich die heutige mich immer dogmatischer anmutende 
Relativitätslehre für mich ablehne, welche sinnesphysiologischen Erwägungen, 
erkenntnistheoretische Bedenken und vor allem experimentell gewonnen Einsichten mich hierzu im 
einzelnen veranlassten, das soll in der Fortsetzung dieses Werkes dargetan werden.’ 




persecution, in 1910 Planck is identified with a vindictive theologian or 
inquisitor, in the preface Mach alleges a crusade is being launched against 
him, and in the 23 July letter he bemoans his isolation and inability to 
overcome the negative attitude which ‘official physics’ has toward him. 
Particularly striking is the pessimistic tone and measure of self-pity in both 
the July preface and in much of the 23 July letter. It seems most improbable 
that Ludwig Mach even in 1913 let alone 1920 could have fabricated his 
father's attitude so flawlessly. One scarcely needs to add that the attack on 
relativity theory in the preface fits the general tone of the July letter so 
well, and especially the last line, as to seem the most natural thing in the 
world. 

One should add that both Friedrich Adler, who proofread the Optik and 
possessed a copy of the 1910 preface or introduction, and Albert Einstein, the 
target of Mach’s attack, perhaps the men best qualified to judge Ludwig's 
honesty, contributed money to help Ludwig Mach during the 1920's, a 
generosity scarcely likely if either had suspected Ludwig of either falsifying the 
book or distorting his father's views. (See Adler's correspondence with Einstein 
in the Archiv für Geschichte der Arbeiterbewegung in Vienna.) 

Did Mach become reconciled with ‘the church of physics’ and Einstein's 
theory of relativity after 1913? According to Einstein, Mach still showed 
'friendly interest' in Einstein's work in an unknown letter perhaps sent as late 
as January 1914, but Einstein's reply in his extant but undated letter that he 
respected Planck as a physicist 'scarcely second to anyone' perhaps did not 
endear him to Mach, and the correspondence apparently withered, especially after 
Einstein joined Planck in Berlin. Mach also wrote a thank you letter to Joseph 
Petzoldt for writing an article on Mach as a forerunner to Einstein in Spring 
1914 while at the same time disposing of one or more letters by Einstein by 
sending the original(s) to Petzoldt. (See Blackmore [1972], pp. 276-7.) July 
1914 found Mach, as already mentioned in the story about Petzoldt's abortive 
visit to Vaterstetten, apparently reluctant to discuss relativity either directly or 
by correspondence. 

Two late letters should help settle the matter. (See Blackmore and Hentschel 
[1985] pp. 155-6 and 160-1.) Čeněk Dvořák, a physicist at Zagreb and 
favourite student of Mach during the 1870's, wrote Mach a letter dated 
August 19, 1915. Most unfortunately, while Dvořák kept many letters by 
Mach, he failed to keep the one of which this was the answer. Nevertheless, one 
does not even have to be able to read German to recognize how critical Mach 
still was of ‘official physics’ and presumably of that unmentionable subject: 
Einstein's theory of relativity. 


Dem was Sie von übertriebenen Spekulation, Massensuggestionen und 
Moderichtungen in der Physik schreiben, werden wol viele der besten derzeitigen 
Physiker beistimmen. 




This letter so clearly indicates Mach's disenchantment with modern physics 
that anyone who wants to maintain that Mach favored this or that particular 
contribution faces an uphill task indeed. In English, Mach blamed physics for 
‘exaggerated speculation’ (and what was more speculative than the special 
and general theories?), ‘mass suggestion’ (professional physicists were leaping 
aboard the relativistic bandwagon), and ‘fashion-chasing’ (as if anything new 
and bizarre should be accepted, and what fitted that description better than 
relativity theory?). Even Gereon Wolters admits an utter absence of evidence 
that Mach was ‘positive’ toward relativity theory during the last twenty 
months of his life. (See Wolters [1987], p. 200.) 

On 12 February 1916, two months after Einstein published his general 
theory of relativity and two weeks before what would have been his seventy- 
eighth birthday had he survived that long, in what appears to have been his 
last letter, Mach wrote to Otto Wiener, a physicist who also opposed Einstein's 
theory: 


I, an ageing man, have no longer been able to keep step with the unanticipated 
development of optics.11 


No, this writer cannot accept the opinion that Mach was 'positive' toward 
Einstein's theory. The weight of evidence seems overwhelmingly in the other 
direction, namely, that he rejected it and not just from 1914 to 1916 but much 
earlier. The philosophers Wolters and more recently Feyerabend are almost 
certainly wrong on this point. Though Mach may possibly have criticized an 
aspect of relativity theory as early as 14 July, 1907 in his personal notebook 
(see Haller and Stadler [1988], p. 204): 'Mass seems to grow with velocity. . . . 
One needs neither mysticism of mass nor mysticism of electricity', he still 
wasn't satisfied that he understood Einstein's theory in 1909 and early 1910, 
but by 1911 when he had already apparently decided to leave the church of 
physics he was already definitely sceptical of it on epistemological grounds as 
Petzoldt's letter of June 1, 1911 indicates. The chief ground (as Petzoldt had 
mentioned in his letter in the paragraph above the reference to Mach's 
objection) was of course that Mach like Petzoldt identified the physical world 
with sensations or at least with conscious experience while Einstein like Planck 
was leaning to the view that there was a real physical world and that it existed 
beyond everything sensory and conscious, that there could be a major 
difference between 'Sein und Schein'. 

On the other hand, Mach realized that his understanding of Einstein's theory 
was hardly sufficient at any point to justify publishing an opinion; nor did he 
want to discourage Einstein, either from developing continuity physics further 
or from becoming a philosophical follower (as he claimed to be at least in the 


11 ‘Mit der ungeahnten Entwicklung der Optik konnte ich der alternde Mann nicht mehr Schritt 
halten.' 




dispute with Planck), but in terms of the understanding Mach did possess he 
appears to have been very sceptical of relativity theory. 

As for the preface to the Optik, even if Ernst Mach did not write all of it, and in 
spite of Wolters’ book the matter is still far from settled, it is not reasonable to 
suppose that the son deliberately or even inadvertently misrepresented his 
father’s views on such an important issue. As Ludwig wrote to Petzoldt on June 
29, 1920: ‘I can only repeat once again, my father was not a relativist as that 
term is currently understood.’12 (See Blackmore [1972], p. 281.) 


5 PHILOSOPHICAL IMPLICATIONS 


When Max Planck suggested at the end of his Leiden speech in December 1908 
that Mach was a false prophet for opposing Boltzmann's kinetic theory of gases 
and the reality of atoms and extended his criticisms in his answer to Mach's 
reply of 1910, he seems to have underestimated the ramifications. (See Planck 
[1909] and [1910].) He was saying in effect that epistemological realists who 
believed that there was a physical world and that it existed independently of all 
consciousness could contribute more to science than non-realists who either 
denied there was a physical world or who reduced it to sensations, ideas, or 
'theoretical entities'. 

If the non-realist Mach rejected the reality of atoms and Einstein’s relativity 
theory, then this suggested that non-realism was incompatible with important 
aspects of science and that any philosophy based on non-realism was probably 
false. 

Many philosophers then and now have been non-realists in the above sense 
and have refused to accept the evidence against Mach, since it could imply that 
their own epistemological position has been mistaken.13 This means, unfortunately, 
that as long as they insist on maintaining a non-realist epistemology 


12 ‘. . . so kann ich nur noch einmal sagen, mein Vater hatte nichts Relativistisches im heutigen 
Sinne.' 

13 One should not only distinguish epistemology from ontology but recognize that epistemological 
realism and non-realism do not understand ontological positions such as matter-monism, mind- 
monism, and mind-matter dualism in the same way. What most non-realists mean by matter- 
monism or materialism would be rejected as phenomenalism or idealism by realists. More 
generally, a non-realist's physical world is opposed as mental by realists, while a realist's 
physical world is condemned as metaphysical or even meaningless by non-realists. Realists 
themselves are divided between direct and indirect realists. The standard distinctions between 
realism and non-realism in this article go back to the 18th century debate over George 
Berkeley's identification of the physical world with what he called ‘ideas’. People who opposed 
that position commonly called themselves realists. One should also add that Immanuel Kant 
re-defined the terms ‘realism’ and ‘idealism’ in a way which was almost completely opposite to the 
standard definitions which can be found in this article and in most dictionaries and 
encyclopedias of philosophy—even in Germany. Philosophers who have been influenced by 
Kant have often accepted his non-standard or counter-definitions. 

It is interesting that whereas historians normally want to employ the most neutral and 
historically informative philosophical terms, definitions, distinctions, and categories, that 




(be it phenomenalist, idealist, pragmatic, instrumentalist, operationalist, 
conventionalist, positivist, phenomenological, existentialist, or peculiarly 
Machian) that they will fight tooth and nail against Planck’s charge that Mach 
was a false prophet and also against the firmly supported opinion that Mach 
rejected the reality of atoms. In more detail, he opposed atoms as indivisible 
particles or ‘substances’, and apparently like his follower Gustav Jaumann was 
reluctant to accept atoms as divisible particles observationally because he 
could not see them, formally because he remained under the influence of the 
old Greek definition in which to be an atom meant to be indivisible, and 
philosophically because he found it difficult to accept either ions as substances 
or the reality of electrons as sub-particles. Nor did Felix Ehrenhaft's 
experiments in Vienna in 1910 [where Mach then lived] encourage accept- 
ance of electrons as determinate entities. 

And perhaps above all, many friends of 'save the appearances' phenomenal- 
ism will do their utmost not to believe that Mach rejected Einstein’s theory of 
relativity. They will be against these last criticisms of Mach not necessarily 
because in terms of their best judgment they regard them as literally false 
(Wolters admits that Mach rejected the reality of atoms and uses the adjective 
'positive' with respect to Mach's attitude toward relativity to 'suggest' an 
opinion which careless readers will misidentify with belief or acceptance, and 
Feyerabend before succumbing to Wolters held the somewhat whimsical 
position that Mach was right in rejecting Einstein's theory) but because such 
criticisms could put their own epistemology or philosophy in general in a bad light if 
widely publicized or accepted (see Feyerabend [1984] for his old view and [1987], 
p. 218 for his new one). 

Since many philosophers seem unable to accept the current weight of 
evidence about Mach's stand concerning Einstein’s theory of relativity, 
without abandoning their own non-realist epistemology, which they are not 
yet willing to do, this author as a semi-philosopher would try to make it easier 
for them by suggesting two relatively painless and practical ways to switch to 
an epistemology about matter and relativity more consistent with and helpful 
to modern science. 

Most philosophers apparently adopt non-realism either because it seems 
more certain about sensory experience (Hume and Mach) or because it appears 


many if not most philosophers prefer party or factional ones which help put their own 
perspective in the best possible light, even if misleading about connections with earlier 
movements. In short, many philosophers not only tend to act as their own advertising agents as 
if persuasion were more important than truth, but think that the most ‘up-to-date’ party 
classification and definitions should be employed instead of the most traditional or neutral. The 
result is that conscientious historians can almost never accept philosophical terms, definitions, 
distinctions, and categories at face value. One must find classification which will help us 
understand all philosophical perspectives in the most informative and fairest way possible. 
Most philosophers do not want readers to understand other points of view in a neutral or fair 
manner. The use of slanted and deceptive classification is probably what angers responsible historians 
the most about philosophers. 




to give more scope to religion (Berkeley and Duhem). But if we study these 
options closely, we will observe that there are other possibilities which tend to 
be more fruitful. 

First, if certainty necessarily means subjective certainty such as direct 
experience or ‘phenomenological physics’ is alleged to provide, then non- 
realists might have a point, but if measuring instruments exist and can 
measure physical behavior even when no one is observing or conscious of the 
instruments, then there is objective evidence, even if people may be required to 
read the instruments, and even if one somewhat misleadingly continues to call 
such an approach 'phenomenological'. We can also use automatic cameras to 
photograph both the instruments and their readings. Neither people nor 
consciousness have to be present. This means that for the vast majority of 
scientists objective evidence can be real and can be employed. 

Furthermore, objective evidence is almost always more reliable and 
informative than subjective evidence, more certain if one will, in what one 
might call the public domain. Regardless of how 'immediately certain' 
sensations might appear to an individual person, it only translates as ‘personal 
opinion' or 'eye-witness testimony' for other people, which in both science and 
lawcourts is normally considered less trustworthy, less certain about physical 
events than photographs, video tapes, machine measurements, and other 
objective evidence. Certainty in physical science means public certainty, what 
is based on objective evidence. 

If philosophers have accepted non-realism on the ground of certainty, and if 
they want public certainty or a higher degree of it, then it would seem wise to 
switch to what can offer the possibility of objective evidence, that is, a realist 
epistemology about a changing, spatially and temporally extended, physical 
world. 

Naturally, one can re-define words such as ‘physical’, ‘real’, ‘realism’, and 
'objectivity' or play other games to give the illusion that non-realism as 
understood in this paper can provide the same degree of public certainty as 
realism, but if one wants the fact of more certainty and that fact depends on 
objectivity about what is beyond everything conscious, such that objective 
evidence can become genuine in the strongest possible sense, then the switch 
should offer everything to gain and nothing to lose. 

And second, concerning a very touchy matter which one must be very 
respectful toward, let us enter the lion's den which few scholars have dared to 
do recently and discuss which epistemology would be most beneficial 
concerning religion, something which may be required to prevent non- 
historical factors from influencing historical judgment. It might seem at first 
glance that religious belief ought to be stronger than mere religious hope, but 
when carefully studied, at least by this author, it turns out not to be true. The 
opposite of religion is not atheism, but despair. (Or as Dante put it about Hell: 




‘Abandon all hope ye who enter.') Religion provides hope for the future. Hope 
is more basic to religion than belief and can outlast belief. 

Furthermore, while there may be enough evidence to justify rational hope 
that a personal God exists and personal salvation is accessible, it appears 
insufficient to justify rational belief, that is, if we assume as most people 
normally do, that rational hope requires possibility, rational belief probability, 
and that it is best to be rational and responsible about all we can. In other 
words, religious hope may be rationally justified where religious belief is not. 

Another strong point is that religious hope, especially loyalty to hope, that 
is, religious faith, is compatible with the apparent probabilities of science and 
common sense, even if they are allowed enough range of application to help us 
comprehend a physical world which is presumed to exist beyond everything 
sensory and conscious. But this ideal religious solution, which respects undimi- 
nished science and common sense while honoring possibility-based religious hope and 
loyalty to hope, requires the adoption of epistemological realism, allows the existence 
and understandability of a trans-sensory and trans-conscious physical world. 

Philosophical conversion to epistemological realism, could it take place, 
would also strengthen the discipline of philosophy itself by making it more 
practical and consistent with rationally-held science, religion, and common 
sense. Indeed, it may be possible to maintain that to be fully rational 
philosophy must be compatible with rationally-held science, religion, and 
common sense. 

And last but not least, such return to practical reason on a large scale could 
also make life easier for historians and scientists by making it psychologically 
unnecessary for philosophers to keep fighting probable historical facts. Such 
conversion could especially help Paul Feyerabend accept that Mach, the 
patron father of Positivist societies in Berlin, Moscow, and later Vienna, was a 
Positivist, at least as that term was generally understood by philosophers in the 
early 20th century, help Gereon Wolters accept the reasonable argument that 
Max Planck was fundamentally right about Mach as a false prophet, and help 
both Wolters and Feyerabend accept the weight of evidence that Ernst Mach in 
his emotional dismay at Planck’s harsh and repeated criticisms almost 
certainly did oppose, not just the reality of atoms and Boltzmann's gas theory 
which he had long done, but part or all of Einstein's theory of relativity, and 
particularly ‘The Church of Physics’.14 


POSTSCRIPT 


Professor Feyerabend in a recent letter to the author of this article coura- 
geously changed his mind. He now sharply rejects Gereon Wolters' position 
that Mach was positive toward Einstein’s theory of relativity and that the anti- 
Einstein preface to the Optik was forged. This means that my article is only 
critical of Feyerabend's 1987 position on these issues, not his current one. One 




hopes that, in light of the evidence against his two theses, Professor Wolters 
will show the same courage by changing his mind too. He will be the better 
man for it. 


Tokyo, Japan 


REFERENCES 


ADLER, F. [1909]: ‘Die Einheit des physikalischen Weltbildes', Naturwissenschaftliche 
Wochenschrift, 8, pp. 817-22. 

ADLER, F. [1919]. 

BLACKMORE, J. T. [1972]: Ernst Mach—His Life, Work, and Influence. University of 
California Press. 

BLACKMORE, J. T. [1988]: ‘Mach über Atome und Relativität—Neueste Forschungsergebnisse' 
(translated by Herlinde Pauer-Studer), in R. Haller and F. Stadler (eds.) 
Ernst Mach—Werk und Wirkung. Verlag Hölder-Pichler-Tempsky, pp. 463-83, 
Wien. 

BLACKMORE, J. T. and HENTSCHEL, K. (eds) [1985]: Ernst Mach als Aussenseiter. Verlag 
Wilhelm Baumüller, Wien. 

CLARK, R. W. [1971]: Einstein—The Life and Times. The World Publishing Company. 

CLASSEN, J. [1910]: ‘Über das Relativitätsprinzip in der modernen Physik’, Zeitschrift für 
physikalischen Unterricht, 23, pp. 257-67. 

COHEN, I. B. [1955]: ‘An Interview with Einstein’, Scientific American, 193, July, pp. 69-73. 

EINSTEIN, A. [1912]: ‘Gibt es eine Gravitationswirkung, die der elektrodynamischen 





14 The issue of whether Mach was a forerunner of Einstein is an important one which can only be 
touched on in this paper. Historically, Mach may have influenced Einstein to reject Newton's 
conception of absolute space, time, and motion, but given the different epistemologies of Mach 
and Einstein and the latter's emphasis on the constancy of light and thus differences between 
what they meant by 'absolute' one may gravely doubt if there are logical connections between 
their respective grounds for opposition. 

Concerning the question whether Mach considered himself a forerunner of Einstein, there 
are still many unclarities. It would be natural in the light of Planck's 1908 accusation that 
Mach was a false prophet for Mach to insist that Einstein was one of his ‘fruit’, and indeed, the 
three brief but vague published references of Mach to Einstein during 1909 and 1910 (see 
Blackmore [1972], pp. 252-3) might have been an effort to further that aim. But none of the 
published references suggested that Mach accepted Einstein's ideas and they were the last 
published references Mach made to Einstein or his ideas during his lifetime. 

The situation is complicated by the fact that in the controversial anti-Einstein 1913/1921 
preface to the Optik there is an attempt to deny that he was a forerunner, possibly through fear 
that admission would be used to falsely suggest that Mach accepted Einstein's theory of 
relativity. 

Mach’s letter to Joseph Petzoldt in Spring 1914 thanking him for writing an article on him as 
a forerunner complicates the matter even further. Possibly, Mach was simply writing a thank 
you letter while actually declining to believe that he was a forerunner. More likely, however, he 
had come around to accepting the evidence that indeed he had been a forerunner, at least in 
historical though not logical terms, and was thanking Petzoldt for emphasizing the matter, 
especially as still another reply to Planck. But forerunnership is one thing and acceptance quite 
another. At no point in any of the extant correspondence or in any publication did Mach either 
state or suggest that he accepted Einstein's special or general theory of relativity. 




Induktionswirkung analog ist?’, Vierteljahrschrift für gerichtliche Medizin, Ser. 3, 
44, pp. 37-40. 

FEYERABEND, P. [1984]: ‘Mach’s Theory of Research and Its Relations to Einstein’, 
Studies in the History and Philosophy of Science, 15, pp. 1-22 (also included in Haller 
and Stadler [1988], pp. 435-62). 

FEYERABEND, P. [1987]: Farewell to Reason. Verso, London. 

FRANK, P. [1917]: ‘Die Bedeutung der physikalischen Erkenntnistheorie Machs für das 
Geistesleben der Gegenwart', Die Naturwissenschaften, 5, pp. 65-72. 

FRANK P. [1947]: Einstein—His Life and Times. Alfred A. Knopf. 

HALLER, R. and STADLER, F. (eds.) [1988]: Ernst Mach—Werk und Wirkung. Verlag 
Hölder-Pichler-Tempsky, Wien. 

HAVRANEK, J. [1977]: ‘Die Ernennung Albert Einsteins zum Professor in Prag’, Acta 
Universitatis Carolinae—Historia Universitatis Carolinae Pragensis, 17, pp. 114-30. 

HEILBRON, J. L. [1986]: The Dilemmas of an Upright Man—Max Planck as Spokesman for 
German Science. University of California Press. 

HENTSCHEL, K. [1985]: ‘On Feyerabend's Version of "Mach's Theory of Research and Its 
Relation to Einstein"’, Studies in the History and Philosophy of Science, 16, pp. 387-94. 

HERNECK, F. [1966]: ‘Die Beziehungen zwischen Einstein und Mach dokumentarisch 
dargestellt’, Wissenschaftliche Zeitschrift der Friedrich-Schiller-Universität Jena, 
mathematisch-naturwissenschaftliche Reihe, 15, pp. 1-14. 

HOLTON, G. [1968]: ‘Mach, Einstein, and the Search for Reality’, Daedalus, 97, pp. 636-73. 

HOLTON, G. [1973]: Thematic Origins of Scientific Thought—Kepler to Einstein. Second 
Printing 1974. Harvard University Press. 

HOLTON, G. [1979]: The Scientific Imagination: Case Studies. Cambridge University Press. 

HOWARD, D. [1987]: Review of Wolters [1987] in Isis, 78, pp. 606-7. 

ILLY, J. [1979]: ‘Albert Einstein in Prag’, Isis, 70, pp. 76-84. 

ISRAEL, H., RUCKHABER, E. and WEINMANN, R. (eds.) [1931]: Hundert Autoren gegen 
Einstein. R. Voigtländers Verlag, Leipzig. 

ITAGAKI, R. [1982]: ‘Why Did Mach Reject Einstein's Theory of Relativity?’ Historia 
Scientiarum, Nr. 22, pp. 81-95. 

KLEINERT, A. [1975]: ‘Anton Lampa und Albert Einstein, die Neubesetzung der 
physikalischen Lehrstühle an der deutschen Universität Prag 1909 und 1910', 
Gesnerus, 32, pp. 283-92. 

LAMPA, A. [1918]: Ernst Mach. Prag. 

MACH, E. [1883]: Die Mechanik in ihrer Entwicklung historisch-kritisch dargestellt. 
Brockhaus, Leipzig. Seventh edition 1912. 

MACH, E. [1910]: ‘Die Leitgedanken meiner naturwissenschaftlichen Erkenntnislehre 
und ihre Aufnahme durch die Zeitgenossen’, Physikalische Zeitschrift, 11, pp. 599-606 
and Scientia, 7, pp. 225-40. 

MACH, E. [1915]: Kultur und Mechanik. W. Spemann, Stuttgart. 

MACH, E. [1919]: Die Leitgedanken meiner naturwissenschaftlichen Erkenntnislehre und ihre 
Aufnahme durch die Zeitgenossen. Sinnliche Elemente und naturwissenschaftliche 
Begriffe. 2 Aufsätze. J. A. Barth, Leipzig. 

MACH, E. [1921]: Die Prinzipien der physikalischen Optik. Historisch und 
erkenntnispsychologisch entwickelt. J. A. Barth, Leipzig. 




Pais, A. [1982]: ‘Subtle is the Lord . . .' The Science and the Life of Albert Einstein. Oxford 
University Press. 

PETZOLDT, J. [1910]: ‘Die vitalistische Reaktion auf die Unzulänglichkeit der mechanischen 
Naturansicht’, Zeitschrift für allgemeine Physiologie, 10, pp. 69-119. 

PLANCK, M. [1909]: ‘Die Einheit des physikalischen Weltbildes', Physikalische Zeitschrift, 
10, pp. 62-75. 

PLANCK, M. [1910]: ‘Zur Machschen Theorie der physikalischen Erkenntnis. Eine 
Erwiderung', Physikalische Zeitschrift, 11, pp. 1186-90 and Vierteljahrsschrift für 
wissenschaftliche Philosophie und Soziologie, 34, pp. 497-507. 

THIELE, J. [1963]: ‘Ernst Mach-Bibliographie', Centaurus, 8, pp. 189-237. 

WOLTERS, G. [1984]: ‘Ernst Mach and the Theory of Relativity’, Philosophia Naturalis, 
21, pp. 630-41. 

WOLTERS, G. [1987]: Mach I, Mach II, Einstein und die Relativitätstheorie—Eine Fälschung 
und ihre Folgen. Walter de Gruyter, Berlin and New York. 

WOLTERS, G. [1988]: ‘Atome und Relativität—Was meinte Mach?’ in Haller and 
Stadler, pp. 484-507. 


Brit. J. Phil. Sci. 40 (1989). 541-555 Printed in Great Britain 


Connectionism, modularity, and tacit 
knowledge 


MARTIN DAVIES 


ABSTRACT 


In this paper, I define tacit knowledge as a kind of causal-explanatory structure, 
mirroring the derivational structure in the theory that is tacitly known. On this 
definition, tacit knowledge does not have to be explicitly represented. I then take 
the notion of a modular theory, and project the idea of modularity to several 
different levels of description; in particular, to the processing level and the 
neurophysiological level. The fundamental description of a connectionist network 
lies at a level between the processing level and the physiological level. At this level, 
connectionism involves a characteristic departure from modularity, and a 
correlative absence of syntactic structure. This is linked to the fact that tacit 
knowledge descriptions of networks are only approximately true. A consequence is 
that strict causal systematicity in cognitive processes poses a problem for the 
connectionist programme. 


1 Tacit knowledge 

2 Modularity 

3 Connectionism 

4 Syntax 

5 Tacit knowledge again 
6 Conclusion 


1 TACIT KNOWLEDGE


It is natural to introduce the notion of tacit knowledge through Chomsky's 
work. In Aspects of the Theory of Syntax, he wrote ([1965], p. 8): 


Obviously, every speaker of a language has mastered and internalised a 
generative grammar [i.e. a system of rules] that expresses his knowledge of his
language. This is not to say that he is aware of the rules of the grammar or even
that he could become aware of them... 


This notion of tacit knowledge of the rules, principles, or generalizations of 
language recurs throughout his work; and several different pieces of 
terminology are used to express the same fundamental point. 




Thus ([1965], p. 8), ‘what the speaker actually knows’ is equated with the 
speaker's competence. Then ([1976], pp. 164-5), in order to sidestep what are 
argued to be irrelevant objections based on intuitive connections (for
example, between knowledge and justified belief, and between competence
and ability), the technical term cognize is introduced, and is explicitly linked
with tacit knowledge ([1980], pp. 69-70): 


The particular things we know, we also cognize. . .. Furthermore, we cognize 
the system of mentally-represented rules from which the facts follow. . .. And 
finally we cognize the innate schematism, along with its rules, principles, and 
conditions. ... Thus 'cognizing' is tacit or implicit knowledge . . . [C]ognizing
has the structure and character of knowledge, but may be and in the interesting 
cases is inaccessible to consciousness. 


(See also [1988], pp. 9-12.) 

Ordinary speakers know—in the familiar everyday sense—and also cognize 
facts about, for example, what various complete sentences mean. In addition, 
they cognize—even though they do not know in the ordinary sense—the facts 
from which those first facts follow. 

We might think of the first facts as stated by the theorems of a systematic 
theory. If we continue to focus on facts about what complete sentences mean, 
the systematic theory will be a semantic theory. Then, the basic idea would be 
this. Ordinary speakers cognize and know the facts stated by these theorems. 
They also cognize—even if they do not know in the ordinary sense—the facts 
stated by the axioms from which the theorems are derived in the theory. 

If we think of the issue in these terms, then it is easy to raise a major question
which confronts any friend of the notion of tacit knowledge. 

There will always be extensionally equivalent theories: distinct sets of 
axioms from which we can derive the same theorems about, say, the meanings 
of whole sentences. Given that fact, does it make any empirical sense to 
suppose that an ordinary speaker tacitly knows, or cognizes, or has 
internalized, one set of axioms, rather than an alternative set from which just
the same theorems of the relevant kind can be derived? Does it make any sense 
to suppose that one theory is psychologically real, rather than another 
extensionally equivalent theory? This is essentially Quine's challenge [1972] 
to the empirical credentials of the notion of tacit knowledge. 

Following a suggestion of Evans [1981], I would aim to respond to this 
challenge by construing tacit knowledge as a certain kind of causal- 
explanatory structure which underlies, or is antecedent to, the pieces of 
knowledge that the speaker has concerning complete sentences. 

We can make the main idea clear enough if we follow Evans in considering
two semantic theories for a very simple little language L. This language has 
just one hundred sentences, constructed out of ten names and ten predicates. 
The names are ‘a’, ‘b’, ..., ‘j’, and the predicates are ‘F’, ‘G’, ..., ‘O’.
Consequently, the sentences are ‘Fa’, ‘Fb’,..., ‘Fj’, ‘Ga’, ‘Gb’,..., ‘Oj’. These 
sentences have meanings which, as we theorists can see from the outside,
depend in a systematic way upon their construction. Thus, all sentences 
containing ‘a’ mean something about John; all sentences containing ‘b’ mean 
something about Harry; all sentences containing 'F' mean something about 
being bald; all sentences containing 'G' mean something about being happy; 
and so on. 

The two semantic theories that we are to consider are both theories of truth 
conditions for L. They assign just the same truth conditions to the sentences of 
L; but they differ in their internal or derivational structure. (We could just as 
well consider theories of meaning strictly so called; but theories of truth 
conditions have the advantage of familiarity.) 

The first theory, T1, is the listiform theory. It simply has one hundred axioms,
one specifying the truth condition of each sentence of the language. The
axioms of T1 thus include:


'Fa' is true if and only if John is bald 
'Ga' is true if and only if John is happy 


and so on. 

The second theory, T2, is a structured or articulated theory. It has an axiom
assigning a semantic value to each name of the language; and likewise, an 
axiom for each predicate. For the name 'a', for example, we have 


'a' denotes John


and for the predicate 'F', for example, we have


a sentence coupling a name with the predicate ‘F’ is true if and only if the object
denoted by the name is bald.


From the twenty axioms of T2, we can derive just the same truth condition
specifications as those that can be derived trivially from the axioms of T1. The
two theories are extensionally equivalent; though they are not, of course, 
logically equivalent. 
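
The contrast between the two theories can be made concrete in a minimal Python sketch. It is restricted to the fragment of L that the text spells out (the names ‘a’ and ‘b’, the predicates ‘F’ and ‘G’), and the function and variable names are illustrative assumptions, not part of either theory.

```python
# Listiform theory (T1 style): one unstructured axiom per whole sentence.
T1_AXIOMS = {
    "Fa": "John is bald",
    "Fb": "Harry is bald",
    "Ga": "John is happy",
    "Gb": "Harry is happy",
}

# Articulated theory (T2 style): one axiom per name and one per predicate.
NAME_AXIOMS = {"a": "John", "b": "Harry"}        # 'a' denotes John, ...
PREDICATE_AXIOMS = {"F": "bald", "G": "happy"}   # 'F' is true of what is bald, ...


def t1_truth_condition(sentence):
    """Read the truth condition straight off the sentence's own axiom."""
    return T1_AXIOMS[sentence]


def t2_truth_condition(sentence):
    """Derive the truth condition from a predicate axiom and a name axiom."""
    predicate, name = sentence[0], sentence[1]
    return f"{NAME_AXIOMS[name]} is {PREDICATE_AXIOMS[predicate]}"


SENTENCES = sorted(T1_AXIOMS)

# Extensional equivalence: both theories yield the same theorem for every sentence.
EXTENSIONALLY_EQUIVALENT = all(
    t1_truth_condition(s) == t2_truth_condition(s) for s in SENTENCES
)
```

The two functions agree on every sentence of the fragment, yet only the second routes each derivation through shared axioms. For the full language the count is 10 × 10 = 100 axioms for T1 against 10 + 10 = 20 for T2.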

Suppose that there is a speaker who uses the sentences of L, with the truth 
conditions which both theories agree in assigning. What evidence can we 
imagine having, which would incline us to attribute to that speaker tacit 
knowledge of the articulated theory T2, rather than merely of the listiform
theory T1? This is the question with which Quine's challenge confronts us. But
more important than this evidential question is a constitutive one. What
would it be for a speaker to have tacit knowledge of T2, rather than merely of
T1?

Evans himself gave a constitutive account of tacit knowledge in terms of 
dispositions ([1981], p. 328): 




I suggest that we construe the claim that someone tacitly knows a theory of 
meaning as ascribing to that person a set of dispositions—one corresponding to 
each of the expressions for which the theory provides a distinct axiom. 


He added that, for the account to work as intended, the notion of disposition 
must be understood ‘in a full-blooded sense’. Given such an understanding 
(p. 330): 


the ascription of tacit knowledge of T2 . . . involves the claim that there is a single
state of the subject which figures in a causal explanation of why he reacts in this
regular way to all the sentences containing [a given expression].


Thus, according to Evans's account, ascription of tacit knowledge of T2
involves the attribution to the subject of twenty distinct dispositions, and 
twenty distinct causal explanatory states—one for each name and one for each 
predicate of the language. 

It is helpful to think of Evans's basic idea in the following way. In theory T2, 
but not in theory T1, the derivations of truth condition specifying theorems for
the sentences 'Fa' and 'Ga' involve a common factor; namely, the axiom for the 
name ‘a’. Likewise, the derivations of theorems for the sentences ‘Fa’ and 'Fb' 
involve a common factor. For tacit knowledge of T2, and not merely of T1, we
require that where there is, in the theory, a derivational common factor there 
should be, in the speaker, a causal common factor. Roughly, for a speaker to 
have tacit knowledge of a particular articulated theory, there must be a causal- 
explanatory structure in the speaker which mirrors the derivational structure 
in the theory. 
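
One way to picture the mirroring requirement is as a difference in how two processors break down. The Python sketch below is an illustrative assumption, not a formulation taken from Evans or from the account given here: it models each causal-explanatory state as a dictionary entry, removes a single state, and records which sentences of a small fragment of L can no longer be interpreted. All names are hypothetical.

```python
from itertools import product

NAMES = {"a": "John", "b": "Harry"}
PREDICATES = {"F": "bald", "G": "happy"}
SENTENCES = [p + n for p, n in product(PREDICATES, NAMES)]  # Fa, Fb, Ga, Gb


def articulated_processor(names, predicates):
    """One state per name and per predicate; each is shared across sentences."""
    def interpret(sentence):
        predicate, name = sentence[0], sentence[1]
        if name not in names or predicate not in predicates:
            return None  # a required state is missing
        return f"{names[name]} is {predicates[predicate]}"
    return interpret


def listiform_processor(table):
    """One autonomous state per whole sentence; nothing is shared."""
    def interpret(sentence):
        return table.get(sentence)
    return interpret


def failures(interpret):
    """Sentences for which the processor yields no truth condition."""
    return {s for s in SENTENCES if interpret(s) is None}


intact = articulated_processor(NAMES, PREDICATES)
full_table = {s: intact(s) for s in SENTENCES}

# Remove the single state corresponding to the name 'a'.
names_without_a = {k: v for k, v in NAMES.items() if k != "a"}
ARTICULATED_FAILURES = failures(articulated_processor(names_without_a, PREDICATES))

# Remove the single state corresponding to the sentence 'Fa'.
table_without_fa = {k: v for k, v in full_table.items() if k != "Fa"}
LISTIFORM_FAILURES = failures(listiform_processor(table_without_fa))
```

In the articulated processor the loss of one state disturbs every sentence containing ‘a’, which is what a causal common factor would lead one to expect; in the listiform processor the loss of one state disturbs exactly one sentence.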

This rough idea requires a number of refinements (Davies [1987]). But for 
present purposes, it is sufficient to observe two attractive features of any 
refinement of the basic idea. The first attractive feature is that there can 
certainly be empirical evidence for or against a particular kind of causal 
structure in a subject. If attributions of tacit knowledge are basically 
attributions of structures of causal-explanatory states, then such attributions 
make perfectly good empirical sense; and they can, in principle, be grounded in 
empirical evidence. Thus, we meet Quine's challenge.

The second attractive feature is that the basic idea, and refinements of it, do 
not require that in order to have tacit knowledge of an articulated theory a 
speaker must conceptualize the axioms or rules of the theory. The basic idea 
leaves room for a distinction between tacit knowledge and propositional 
attitudes like belief (see Davies [1989]). 

In fact, the account does not require that there be any explicit representations
(doxastic or subdoxastic, personal or subpersonal) of the axioms or
rules that are tacitly known. Tacit knowledge can be realized by the presence of 
a processor rather than the presence of a collection of representational states, 
provided that the processing exhibits the requisite causal structure. The fact
that the account does not require explicit representation in no way trivializes 
it; for there is all the difference in the world between processing with a 
structure that mirrors the derivational structure in T2 and processing whose
structure merely mirrors the derivational structure in T1. For example, a
processor with an autonomous component for each sentence of L would meet 
the latter condition, but not the former. 


2 MODULARITY 


The notion of modularity can also be introduced via Chomsky's work. In 
Knowledge of Language [1986], he recommends distinguishing between 
internalized language—that is, I-language—and grammar. I-language is 
‘some element of the mind of the person who knows the language’ (p. 22); a 
grammar, in contrast, is a theory of I-language. A grammar is not a cognitive 
structure; it is a linguist’s theory. Now, Chomsky describes grammars as 
modular (p. 71); and his exposition of the current state of linguistic theory is 
under the heading ‘Modules of grammar’ (p. 160). In this use of the term, a 
module is a subtheory of a linguist's theory of I-language. 

If a particular grammar is a correct theory of the I-language of a speaker,
then the language faculty of that speaker can be characterized—at one level of 
description—by that grammar. If the grammar is modular, then the language 
faculty that it characterizes can itself be said to be modular; it is an articulated 
information processing system. Thus Chomsky says ([1986], p. 204): 


The general idea that the language faculty involves a precisely articulated 
computational system—fairly simple in its basic principles when modules are 
properly distinguished, but quite intricate in the consequences that are 
produced—seems reasonably well established. 


A module within the language faculty will be a subsystem that is characterized 
by a module of the grammar. 

For a grammar to be a correct theory of I-language is for it to be
psychologically real, or tacitly known. And what this requires—according to 
Section 1—is that there should be causal common factors lying behind pieces 
of linguistic knowledge in the speaker, exhibiting a pattern that mirrors the 
way in which there are derivational common factors lying behind the 
pronouncements of the grammar. 

Thus far, we have two different notions of a module. We could label these 
notions. On the one hand, there are modules in the analytical sense: constituent 
subtheories of a theoretical characterization of a cognitive task. Chomsky's 
modules of grammar are modules in this sense. On the other hand, there are 
modules in the processing sense: constituent subsystems of a cognitive system. 
Fodor's modules (Fodor [1983]) are modules in this sense; although Fodorian
modularity involves characteristics not required simply by modularity in the 
processing sense. 

But there are more notions of modularity than just these two. For just as the 
processing sense of modularity is the result of projecting the analytical notion 
to the tacit knowledge level of description, so we can introduce notions of 
modularity at other levels of description too. 

Suppose, for example, that we have identified, at the level of theoretical
characterization, two components C and D of some cognitive task. Suppose
further that, as a matter of empirical fact, the cognitive system under study 
does perform the task in question by having inter alia component subsystems 
that carry out the subtasks C and D. This would be a highly non-trivial 
empirical fact. But it would leave open the further empirical question whether 
the parts of the brain that subserve the performance of task C are distinct from 
the parts of the brain that subserve the performance of task D. 

In fact, there are a number of more precise questions that we can ask when 
we move to the neurophysiological level of description. One which is of some 
importance is the question whether the geographical region of the brain 
implicated in task C overlaps, or is disjoint from, the region implicated in task 
D. The importance of this question is that, the more the respective regions 
overlap, the less likely it is that the brain in question could, in practice, be 
damaged in such a way as to disturb the performance of one task while leaving 
the performance of the other task intact. 

We thus have three different notions of modularity, belonging at three 
different levels of description and explanation. (These correspond very roughly 
with Marr's three levels: [1982], pp. 24-5.) There is a clear enough distinction
between the analytical notion and the processing notion, even though they are 
closely related: one kind of modularity is a feature of theories, the other is a 
feature of systems. Both the processing notion and the neurophysiological 
notion specify an empirical feature of systems, but the distinction between 
these two notions is crucial nevertheless. 

For example, cognitive neuropsychology is the branch of cognitive psychol- 
ogy in which models of normal cognitive processes are evaluated in the light of 
data provided by observations of people with acquired disorders of cognition. 
The classical form of argument in cognitive neuropsychology is from an 
observed double dissociation of deficits to a claim about modularity. The 
systems X and Y that are responsible for the performance of the tasks A and B 
are argued to be independent systems or separate modules, on the grounds 
that performance of each of the tasks can be impaired while performance of the 
other remains intact. 

The cognitive neuropsychologist infers modularity from findings of dissocia- 
tions. But he does not generally infer absence of modularity from the failure to 
observe dissociations. Rather, if dissociation between the performance of two
tasks is not found, then the cognitive neuropsychologist considers two possible 
explanations. One possible explanation is that the cognitive model is incorrect; 
the two tasks A and B are really performed by a single integrated system. The 
other possible explanation is that, although there are indeed two independent 
information processing systems present, psychologically unimportant features 
of neurophysiology prevent the systems from being damaged separately. (On 
these issues, see Coltheart [1985].) 
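
The asymmetry in this form of argument can be set out in a short Python sketch. The representation of patients as pairs of intact or impaired tasks, and all the names used, are assumptions made for illustration; nothing here is drawn from the neuropsychological literature itself.

```python
from typing import NamedTuple


class Patient(NamedTuple):
    """Hypothetical record of one patient's performance on tasks A and B."""
    task_a_intact: bool
    task_b_intact: bool


def double_dissociation(patients):
    """True if some patient keeps A while losing B, and another keeps B while losing A."""
    a_without_b = any(p.task_a_intact and not p.task_b_intact for p in patients)
    b_without_a = any(p.task_b_intact and not p.task_a_intact for p in patients)
    return a_without_b and b_without_a


def licensed_conclusion(patients):
    """The inference runs one way only: dissociation supports modularity,
    but its absence does not establish a single integrated system."""
    if double_dissociation(patients):
        return "separate modules at the processing level"
    return "undetermined: one integrated system, or modules that cannot be damaged separately"


DISSOCIATED = [Patient(True, False), Patient(False, True)]
NOT_DISSOCIATED = [Patient(False, False), Patient(True, True)]
```

The second return value reflects the two explanations distinguished above: the absence of dissociation leaves open both an incorrect cognitive model and a mismatch between processing modularity and neurophysiological modularity.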

The cognitive neuropsychologist is theorizing about modularity at the 
processing level; but his arguments are complicated by the fact that 
modularity at that level might not be matched by modularity at the 
neurophysiological level. 


3 CONNECTIONISM 


The processing level, as we have so far characterized it, is a level at which
the description of a system is an interpreted (semantic, cognitive, or content 
using) description. The interpreted description is cast in the same terms as the 
theoretical characterization of the task at the analytical level; this is 
particularly clear if we think of the interpreted description as a tacit knowledge 
description. 

In fact, the simple equation of the processing level with the level of tacit 
knowledge description is potentially misleading. A description at the tacit 
knowledge level specifies the information that the system draws upon. But a
full description at the processing level should surely do more than that; it 
should specify, in addition, how the information is drawn upon. To the extent 
that the processing level is to be identified with Marr's level two—the level of 
the algorithm—the tacit knowledge level should be distinguished as a slightly 
higher level of description. (Peacocke [1986] labels it level 1.5.) What we
really have is a hierarchy of levels of coarser and more detailed interpreted 
descriptions of the way in which the task is carried out. 

But as well as all these levels of interpreted description, there are also 
uninterpreted descriptions which are still different from descriptions at the 
physiological level. 

For example, those classical computational theorists who favour the symbol 
manipulation paradigm recognize a level of uninterpreted syntactic descrip- 
tion. This is not to say that every state which has an interpreted, or semantic, 
description also has a description as a representational state with a syntax. For 
a piece of tacit knowledge can be realized by the presence of a computational
processor. But, what is insisted upon is that the representational states which 
constitute the domain of the processor should be syntactically structured 
states. Thus, Fodor says ([1987], p. 25): 

[The representational theory of mind] says that the contents of a sequence of 
attitudes that constitutes a mental process must be expressed by explicit
tokenings of mental representations. But the rules that determine the course of 
the transformation of these representations ... need not themselves ever be
explicit.


(Cf. Fodor [1985], p. 95.)

The friends of parallel distributed processing (PDP) also recognize a level of 
uninterpreted description which is quite distinct from the physiological level.
At this level, the descriptions are in terms of activation at nodes or units, 
mediated by weights or strengths attached to connections between the units. 
Let us label this level of formal description of a connectionist system the 
network level. 
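
A description at the network level can be illustrated by a minimal Python sketch of one layer of units. The logistic activation function, the particular weights, and the function names are assumptions chosen for illustration; the text commits only to activations at units mediated by weights on connections.

```python
import math


def logistic(x):
    """A common squashing function; its use here is an assumption."""
    return 1.0 / (1.0 + math.exp(-x))


def propagate(activations, weights):
    """Compute output activations from input activations.

    weights[j][i] is the weight on the connection from input unit i
    to output unit j. Nothing in this computation refers to what any
    pattern of activation means.
    """
    return [
        logistic(sum(w * a for w, a in zip(row, activations)))
        for row in weights
    ]


INPUT_PATTERN = [1.0, 0.0, 1.0]            # hypothetical input activations
WEIGHTS = [[0.5, -0.3, 0.5],               # hypothetical connection weights
           [-1.0, 0.8, 1.0]]
OUTPUT_PATTERN = propagate(INPUT_PATTERN, WEIGHTS)
```

The whole transition from input pattern to output pattern is fixed by numbers alone, which is the sense in which the network level is a level of uninterpreted description.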

On the face of it, the availability of this level of uninterpreted description does 
not count against the validity of interpreted, or semantic, descriptions of a 
connectionist network. 

Indeed, just as the classical theorist recognizes representational states and 
computational processes as vehicles of semantic content, so too, the connec- 
tionist assigns semantic content to two kinds of patterns within networks. 

Some of the information in a system is realized by particular patterns of
activation of the units (Smolensky [1988], p. 6): 


The entities in the [network] with the semantics of conscious concepts of the task 
domain are complex patterns of activity over many units. Each unit participates 
in many such patterns. 


And some of the information is realized by patterns of weights attached to 
connections (p. 13): 


Patterns of activity representing inputs are directly transformed (possibly 
through multiple layers of units) to patterns of activity representing outputs. The 
connections that mediate this transformation represent a form of task know- 
ledge... 


In a connectionist network, then, the bearers of semantic content are complex, 
structured items. 

Some philosophical discussions give the impression that such an interpreted 
description of a connectionist network is of, at most, heuristic significance; and 
that the advent of connectionism brings nearer the demise of content using 
explanations in cognitive psychology. But really, the issue of interpreted or 
content using description and the issue of connectionism should be regarded as 
orthogonal. There are four positions that a theorist might occupy. 

One quadrant is for those friends of symbol manipulation who insist on the 
role of content using descriptions (e.g. Fodor [1987]). A second quadrant is for 
friends of symbol manipulation who prescind from content (e.g. Stich [1983]). 
There is a third position that is occupied by enthusiasts for connectionism who 
would altogether eliminate appeal to semantic content (e.g. Churchland 
[1988]). And there is a fourth box, to be occupied by connectionists who insist 
that content using descriptions are essential for psychological theory. 




The following remark by Smolensky ([1987], p. 101) seems to place him in 
that fourth box: 


the formal system is at a lower level than the level of semantic interpretation: the 
level of denotation is higher than the level of manipulation. . . . Both levels are 
essential: the lower level is essential for defining what the system is (in terms of
activation passing) and the higher level is essential for understanding what the
system means (in terms of the problem domain).


But, whether or not any particular theorist clearly occupies the fourth 
quadrant, if we take Fodor's position as the canonical version of the symbol 
manipulation paradigm, then the appropriate comparison is with content 
using connectionism. 

So far then, we have no reason to deny that a connectionist system can have 
a true tacit knowledge description. Nor do we yet have any reason to deny that 
a connectionist system may exhibit modularity at the processing level, or the
tacit knowledge level. For all that this latter requires is that the network should 
have a true tacit knowledge description cast in the terms of a modular theory 
(that is, a theory that is modular in the analytical sense). It does not require 
that the articulation in the formal description of a network should exactly 
match the articulation of the original modular theory into subtheories. A 
system that is modular in the processing sense need not also be modular at the 
network level—the level of description in terms of units and connections—any 
more than it has to be modular at the physiological level. 

Indeed, it is not generally the case that a connectionist network is built from 
smaller component networks corresponding to the constituent subtheories of a
modular theory. 

This is not to say that connectionism is committed to the extreme view that 
there is a single giant network, responsible for all cognitive processes. On the 
contrary, it is explicit in the work of PDP theorists that specific tasks may be 
assigned to distinct networks, and that this amounts to an element of 
modularity (Hinton, McClelland and Rumelhart [1986], p. 79): 


A system that uses distributed representations still requires many different 
modules for representing completely different kinds of thing at the same time. 
The distributed representations occur within these localized modules. For 
example, different modules would be devoted to things as different as mental 
images and sentence structures . . . 


There is a potentially misleading mention of ‘localized modules’ in this passage: 
we should not confuse modularity at the network level with the physiological 
notion of modularity. Rather, the point is that localization, or physiological
modularity, requires network modularity (though the converse does not hold).
To the extent that there is evidence of neural localization of cognitive 
functions, this is still consistent with the connectionist programme; for that
programme already includes an element of modularity at the network level. 

Here it is useful to employ the idea of nested modules, and of coarser and 
finer grains of modularity. At the analytical level, a theory may be composed of 
subtheories which themselves have a modular structure. Indeed, we could 
think in terms of a massive psychological theory, whose subject matter is the 
whole of cognition, and which is composed of relatively independent 
subtheories concerning particular cognitive functions. The idea of the 
language faculty ‘with its specific properties, structure, and organization, one 
"module" of the mind’ (Chomsky [1986], pp. 12-13) is a reflection at the 
processing level of this coarse grained analytical modularity. A component 
theory, concerning a particular aspect of cognition (such as language, or
vision), may itself be modular; and we can pursue this finer grained
modularity through the various levels of description of a cognitive system. 

The connectionist programme is committed to some coarse grained 
modularity at the network level; but it is not committed to modularity at the 
network level matching any finer grained modularity at the analytical level. 
(Cf. the discussion of propositional modularity in Ramsey, Stich and Garon [to 
appear].) 

According to the story that we have told so far, connectionism's character- 
istic departure from modularity at the network level is compatible with 
descriptions of PDP systems as embodying tacit knowledge of modular 
theories. The remaining two sections of the paper will call that compatibility 
into question. 


4 SYNTAX 


The formal articulation of a connectionist network need not—and typically 
does not—reflect the articulation in the interpreted description of the system at 
the tacit knowledge level or processing level. This is why connectionism 
departs from fine grained modularity at the network level of description. 

If we now focus on patterns of activation as vehicles of semantic content, 
then we can see another consequence of the mismatch between the tacit 
knowledge description and the network description. The articulation in the 
network description of a connectionist system is in general not the syntactic 
articulation that is characteristic of symbol manipulation. 

As we have already noted, the symbol manipulation paradigm does not 
require explicit representation of computational procedures. So the relevant 
issue is not whether patterns of weights amount to syntactic encodings of 
tacitly known rules. But symbol manipulation does require syntactic structure 
in the representational states that lie in the domain of those procedures. The
mismatch between the interpreted description and the uninterpreted descrip- 
tion of a connectionist network promises a sharp characterization of the 
difference between the two programmes. For, in general, the connectionist
analogues of syntactic representations (namely, patterns of activation) are
structured, but not syntactically structured.

Someone might object to this characterization of the difference. It might be 
said that the symbol manipulation paradigm can itself recognize levels of 
description lying between the level of uninterpreted, syntactic, description and 
the physiological level. Nothing so far shows that the level of formal
description of a connectionist network is anything other than one such 
intermediate level. 

There are a number of correct points here. A pattern of activation in units is 
a structured item; and there is nothing in the idea of such a pattern, as such, 
which prevents it from having a syntactic description. What is more, the way 
in which one pattern of activation leads to another pattern at a later time can 
be specified without reference to what the patterns of activation mean; it is
precisely so specified in the formal description of the network. So, transitions 
between patterns of activation meet a familiar formality condition upon 
symbol manipulation. Furthermore, it is possible that a system for symbol 
manipulation should have a description—at a level lower than the syntactic 
level—as a network of connected units. 

But none of this adds up to an argument for regarding patterns of activation 
as such as syntactically structured. 

According to Fodor, syntax must meet three conditions. First, ‘The syntax of 
a symbol is one of its higher-order physical properties’ (Fodor [1987], p. 18).
Second, syntax is systematically related to semantics. Third, syntax is a 
determinant of causal role (ibid. pp. 16-21). 

We can agree that the structure in a complex pattern of activation meets the 
first and third of these conditions. But, the constituent features of a pattern of 
activation—namely, specific levels of activation at individual units—need not, 
and typically do not, make a systematic contribution to the semantic content of 
the overall pattern; they are not like words in a natural language sentence. 

The articulation within a pattern of activation does not constitute a 
syntactic structure, so long as the interpreted description is afforded by the 
processing or tacit knowledge level. 

It is possible to introduce a different level of interpreted description, lining up
more neatly with the uninterpreted, formal, network description. This level is 
sometimes called the subconceptual level; the processing level is then called the 
conceptual level. (See Smolensky [1988], p. 3. The terminology is not ideal since 
it may suggest that tacit knowledge involves conceptualization; see again 
Davies [1989].) The main difference between these two levels of interpreted 
description is this. The concepts used in the conceptual level description are the 
primitive concepts deployed in the theoretical characterization of the task at 
the analytical level. In contrast, the description at the subconceptual level is in 
terms of microfeatures. 

A consequence of this semantic dimension shift (Smolensky's phrase: [1988],
p. 11) between the conceptual and the subconceptual description is that, while 
the subconceptual interpreted description is a genuinely accurate semantic 
description of the operation of the network, the conceptual description is an 
approximation. This consequence calls for a modification to the idea that a 
pattern of excitation is straightforwardly a vehicle of semantic content. 

Suppose that we consider a family of states whose (conceptual level) 
interpreted descriptions have something in common. Perhaps, they are all 
states whose contents concern coffee. Or (recalling the language L in Section 
1), they might all be states whose contents concern sentences containing the 
predicate ‘F’.

The original idea about bearers of semantic content would suggest that the 
states in such a family involve a common subpattern of activation which has 
an interpreted description as being about coffee, or being about a sentence 
containing the predicate ‘F’. But really this is not so, as Smolensky ([1988], 
p. 17) makes explicit: 


These constituent subpatterns representing coffee in varying contexts are activity 
vectors that are not identical, but possess a rich structure of commonalities and 
differences (a family resemblance, one might say). 


Similarly, if a connectionist network were to perform the task of assigning a 
meaning (specified in some format) to each sentence of L, then the constituent 
subpatterns representing the presence of the predicate 'F' in the varying 
contexts provided by the sentences ‘Fa’, ‘Fb’, and so on, would not be identical. 

The argument at the beginning of this section showed that the articulation 
within a pattern of activation does not constitute syntactic structure, given 
that the semantic description is cast in the same terms as the theory at the 
analytical level. Someone might have responded to that argument with the 
suggestion that we develop a level of syntactic description by taking certain 
subpatterns of activation to be the primitive syntactic items corresponding to 
the primitive concepts that are employed in the theory at the analytical level. 
Because patterns of activation are simply superimposed, this would have been 
a rather weak suggestion; it would not even preserve the idea of the order of 
constituents in a syntactically complex expression. But, in any case, we can 
now see that the suggestion would not work. For there is no single pattern of 
activation corresponding to each primitive concept; and so there are no 
candidates for the role of syntactic primitive. 


5 TACIT KNOWLEDGE AGAIN 


We can now draw an important consequence for the attribution of tacit 
knowledge to connectionist systems. Recall once again the example in
Section 1 and the two semantic theories T₁ and T₂.

Suppose that a network that involves a dimension shift between its 


Connectionism, modularity, and tacit knowledge 553 


conceptual and subconceptual descriptions succeeds in assigning the correct 
truth conditions to all the sentences of L. In particular, it assigns the correct 
truth conditions to all the sentences containing the predicate ‘F’: ‘Fa’ is true iff 
John is bald, ‘Fb’ is true iff Harry is bald, and so on. 

Suppose too, that this network is not simply made up of a collection of 
completely autonomous subsystems, one for each of the sentences containing 
'F'. Then we do not have a straightforward instance of tacit knowledge merely 
of the listiform theory T₁.

Nevertheless, it need not be the case that there is a single pattern of weights 
on connections which is a causal common factor in all these transitions from 
representations of sentences to representations of meanings. For a typical 
connectionist system, we shall be able to say only that the approximate 
equivalence of the patterns corresponding to the predicate ‘F’ in varying 
contexts results in a considerable overlap in the patterns of connection weights 
implicated in the several transitions. Consequently, it will not be strictly 
correct to attribute to the network tacit knowledge of the articulated semantic 
theory T₂. Such an attribution will be, at best, approximately correct.

This result generalizes. Typically, PDP systems do not strictly embody tacit 
knowledge of modular theories. 

It is not an accident that the absence of accurate tacit knowledge 
attributions, and the absence of syntactic structure, go in step here. Tacit 
knowledge does not have to be explicitly represented; it can be realized by the 
presence of a processor. But tacit knowledge is a matter of strict causal 
systematicity in the transitions mediated by that processor—causal systemati- 
city mirroring the derivational systematicity in the theory that is tacitly 
known. And the way to incorporate that causal systematicity is to provide, for 
the states which are inputs to the processor, a physical articulation or 
structure which is systematically related both to the interpreted descriptions of 
those states and to the causal transitions to which the states lead. Given the 
three conditions upon syntactic descriptions, what this amounts to is 
providing the input states with a syntax. (For arguments from causal 
systematicity of process to syntactically structured representations, see Fodor 
[1987], pp. 135-54, and Fodor and Pylyshyn [1988].) 


6 CONCLUSION 


What prevents even the most rudimentary syntactic articulation in the states 
of a connectionist network is the dimension shift between the conceptual and 
subconceptual level. Given such a shift, the terms deployed in the theory at the 
analytical level do not figure in any accurate interpreted description of the
network. 

It is arguable that the apparent empirical inadequacy of some connectionist 




models is attributable to this attempt to do without resources which are, in 
fact, crucial; namely, the categories used in the classical theoretical characteri- 
zation of the cognitive task in question. (For this issue, see Rumelhart and 
McClelland [1986] and Pinker and Prince [1988].) 

More generally, strict causal systematicity of the kind required for tacit 
knowledge presents a problem for the connectionist programme. For the 
absence of a syntactic level of description is characteristic of connectionism. 
But causal systematicity requires syntactically structured representational 
states.1

Philosophy Department 
Birkbeck College 

Malet Street 

London WC1E 7HX 


REFERENCES 


CHOMSKY, N. [1965]: Aspects of the Theory of Syntax. Cambridge, Massachusetts: MIT 
Press. 

CHOMSKY, N. [1976]: Reflections on Language. London: Fontana/Collins.

CHOMSKY, N. [1980]: Rules and Representations. Oxford: Blackwell.

CHOMSKY, N. [1986]: Knowledge of Language: Its Nature, Origin and Use. New York:
Praeger.

CHOMSKY, N. [1988]: Language and Problems of Knowledge. Cambridge, Massachusetts:
MIT Press.

CHURCHLAND, P. M. [1988]: ‘On the nature of theories: A neurocomputational 
perspective’, in Minnesota Studies in the Philosophy of Science, Volume 14. 
Minneapolis: University of Minnesota Press. 

COLTHEART, M. [1985]: ‘Cognitive neuropsychology and the study of reading’, in M. I.
Posner and O. S. M. Marin (eds.), Attention and Performance XI, pp. 3-27. London: 
Erlbaum. 

DAVIES, M. [1987]: ‘Tacit knowledge and semantic theory: Can a five per cent difference
matter?’, Mind, 96, pp. 441-62.

DAVIES, M. [1989]: ‘Tacit knowledge and subdoxastic states’, in A. George (ed.),
Reflections on Chomsky, pp. 131-52. Oxford: Blackwell. 

EVANS, G. [1981]: ‘Semantic theory and tacit knowledge’, in Collected Papers, pp. 322-
42. Oxford: Oxford University Press (1986). 

FODOR, J. [1983]: The Modularity of Mind. Cambridge, Massachusetts: MIT Press.

FODOR, J. [1985]: ‘Fodor’s guide to mental representation: The intelligent Auntie’s vade-mecum’,
Mind, 94, pp. 77-100.

FODOR, J. [1987]: Psychosemantics. Cambridge, Massachusetts: MIT Press.


1 An earlier version of this paper was written while I was visiting the Research School of Social
Sciences, Australian National University in late 1987, and was presented at the conference 
Cognition et Connaissance held in Toulouse, in March 1988. I am grateful to Ned Block for 
comments on a more recent version. 




FODOR, J. AND PYLYSHYN, Z. [1988]: ‘Connectionism and cognitive architecture: A
critical analysis’, Cognition, 28, pp. 3-71. 

HINTON, G. E., MCCLELLAND, J. L. AND RUMELHART, D. E. [1986]: ‘Distributed
representations’, in D. E. Rumelhart, J. L. McClelland and the PDP Research Group,
Parallel Distributed Processing, Volume 1, pp. 77-109. Cambridge, Massachusetts: 
MIT Press. 

MARR, D. [1982]: Vision. New York: W. H. Freeman and Co.

PEACOCKE, C. [1986]: ‘Explanation in computational psychology: Language, perception 
and level 1.5’, Mind and Language, 1, pp. 101-23.

PINKER, S. AND PRINCE, A. [1988]: ‘On language and connectionism: Analysis of a 
parallel distributed processing model of language acquisition’, Cognition, 28, pp. 73-193.

QUINE, W. V. O. [1972]: ‘Methodological reflections on current linguistic theory’, in D.
Davidson and G. Harman (eds.), Semantics of Natural Language, pp. 442-54. 
Dordrecht: Reidel. 

RAMSEY, W., STICH, S. AND GARON, J. [to appear]: ‘Connectionism, eliminativism, and
the future of folk psychology’.

RUMELHART, D. E. AND MCCLELLAND, J. L. [1986]: ‘On learning the past tenses of English 
verbs’, in J. L. McClelland, D. E. Rumelhart and the PDP Research Group, Parallel 
Distributed Processing, Volume 2, pp. 216-71. Cambridge, Massachusetts: MIT 
Press. 

SMOLENSKY, P. [1987]: ‘Connectionist AI, symbolic AI, and the brain’, Artificial 
Intelligence Review, 1, pp. 95-109. 

SMOLENSKY, P. [1988]: ‘On the proper treatment of connectionism’, Behavioural and 
Brain Sciences, 11, pp. 1-74. 

STICH, S. [1983]: From Folk Psychology to Cognitive Science. Cambridge, Massachusetts:
MIT Press.


Brit. J. Phil. Sci. 40 (1989), 557-567 Printed in Great Britain


REVIEW 


HOOKWAY, CHRISTOPHER [1988] 
Quine: Language, Experience and Reality. 
Stanford: University Press. xii + 227 pp. $35.00 cloth, $11.95 paper


ROGER F. GIBSON, JR. 
Washington University 


Christopher Hookway's new book is a well written and eminently readable 
introduction to the systematic philosophy of W. V. Quine. 'In line with the aims
of the series' (p. x) in which this book occurs, Hookway succeeds brilliantly in
his attempt 'to make the book accessible to non-philosophers and to students' 
(p. x). Even so, the specialist in Quine's thought, too, will find much of the book 
very insightful and informative. 

The central premise of the book is that while Quine has effectively criticized 
and rejected the semantical foundation upon which logical empiricism was 
based, he has not rejected some of its key doctrines, namely: empiricism (‘All
our knowledge of external reality comes through the senses’ (p. 3)); scientism
(‘the only real knowledge is scientific knowledge’ (p. 3)); and physicalism (‘the
universe is, fundamentally, a physical system’ (p. 3)). The central conclusion 
of the book is that Quine's ‘greatest philosophical contribution has probably
been to develop, in a consistent and rigorous fashion, the consequences of a set 
of assumptions [viz., empiricism, scientism, and physicalism] whose appeal
cannot be denied even by those philosophers who reject them' (p. 3). Putting 
his conclusion slightly differently, Hookway says, 'it seems to me that the 
importance of Quine's work lies in the fact that he has worked through what is 
involved in a physicalist empiricism more thoroughly and rigorously than any 
other post-positivist philosopher' (p. 219). Hookway's use of 'important' in this 
context is purposely ambiguous as between 'important because Quine's 
conclusions vindicate his assumptions’ and ‘important because Quine's 
conclusions repudiate his assumptions’. However, by the time the reader gets 
to the end of the book he/she will have acquired a healthy suspicion that 
Hookway prefers the latter construal to the former. 

Hookway weaves his way from his premise to his conclusion through some 
twelve chapters, comprising the four parts of the book. In the first part, he
‘examines the views defended in From a Logical Point of View, and introduces
the sources of Quine’s naturalism’ (p. 3). In the second part, he ‘explains the 
metaphysical and logical doctrines which determine the character of many of 
[Quine’s] views, and which come to the fore in Word and Object’ (p. 3). In the 


558 Review


third part, he examines Quine’s doctrine of ‘indeterminacy of translation, and 
compare[s] Quine's views with those of ... Donald Davidson’ (p. 3). In the 
fourth part, he evaluates 'Quine's physicalist naturalism and his empiricism’ 
(p. 3). In what follows, I shall organize my comments around each of these four 
parts of Hookway's book. 

Part I: The Evolution of Empiricism. In the three chapters that comprise Part I, 
Hookway cleverly exploits the five milestones of empiricism (taken from 
Quine's essay of the same name) to introduce some of Quine's views found in 
From a Logical Point of View. 'The "milestones" to which Quine refers all
involve developments in our philosophical understanding of representation: 
they promise philosophical enlightenment by overthrowing entrenched, but 
mistaken, conceptions of how thought and language work' (p. 8). The first 
milestone is the shift from concern with ideas to concern with words; the 
second is the shift from taking words as the focus of semantics to taking 
sentences as the focus; the third is the shift from taking sentences as the units of 
empirical significance to taking substantial chunks of theories as such units, 
i.e., moderate holism; the fourth is methodological monism, i.e., the rejection of
the analytic-synthetic distinction; the fifth, and final milestone of empiricism, 
is naturalism, i.e., the abandonment of first philosophy. 

Hookway does an admirable job of spelling out the connections between 
each of the milestones of empiricism and various Quinian doctrines. Here are
some examples: the second milestone (viz., the primacy of sentence meaning) 
convinces us that words are not, in general, names; that meaning is not 
reference; that universals need not be posited as the meanings of general terms; 
that linguistic significance and synonymy are not explained by positing 
meanings; that philosophical analysis, explication, makes no synonymy 
claims; that to be is to be the value of a bound variable; and that reference is
inscrutable. The third milestone (viz., holism) impels us to scuttle both the 
analytic-synthetic distinction and epistemological reductionism, and it serves 
as a premise in the argument for indeterminacy of translation. The fourth 
milestone (viz., methodological monism) encourages a new empiricist account 
of alleged a priori knowledge, and it opens the way for a rapprochement of 
philosophy and science. The fifth milestone (viz., naturalism), whose sources 
are holism and an attitude of unregenerate realism, offers new hope for a 
successful epistemology. 

Hookway's exegesis in Part I is guided by the plausible claim that 'Quine's 
work must be understood against the background of logical empiricism, in 
particular [against] the views of Rudolf Carnap' (p. 26). Hookway does a good 
job of explaining Carnap's and Quine's contrasting views regarding putative 
linguistic rules and whether they are useful in explaining linguistic understanding,
scientific rationality, and alleged a priori knowledge. After explaining
how the internal-external distinction emerged in Carnap's evolving
thought, Hookway emphasizes that Carnap, no less than Quine, recognized 


The British Journal for the Philosophy of Science 559 


the fact that holism occurs. But while this recognition moved Quine to 
embrace the fourth milestone of empiricism (viz., methodological monism), 
Carnap remained steadfast. He simply did not regard holism as undercutting 
either the internal-external distinction or the analytic-synthetic distinction. 
‘The puzzle’, says Hookway, ‘is less why Quine made this move than why 
Carnap didn’t’ (p. 37). Indeed, the question has been no less puzzling to Quine 
himself. When asked that very question at a conference on his philosophy held 
in St. Louis in April 1988, Quine said that he could not account for it, that the 
matter has always puzzled him. From the audience, Burton Dreben volun- 
teered that never for a moment did it ever occur to Carnap that logic and 
mathematics were anything but analytic, and J. Richard Creath added that 
both Duhem and Poincaré (cited by Carnap as sources) regarded holism as 
applying only to theoretical physics, not across the board. Perhaps these 
remarks shed some explanatory (psychological?) light on Carnap's intransi- 
gence. 

Part II: Logic and Reality. In the four chapters (4, 5, 6, and 7) that comprise 
Part II, Hookway examines the natures of, and some arguments for and
against, 'the sources of Quine's austere conception of reality' (p. 63), viz., 
physicalism and extensionalism. 

He distinguishes three physicalist theses in Quine; they are: (1) the events 
that fall under the laws of any science are physical events and thus fall under 
the laws of physics; (2) the physical facts are all the world's facts; (3) only 
physical explanation is real, scientific, explanation. 

Hookway is uncomfortable with all three of Quine's physicalist theses. His 
discomfort with thesis (1) is rooted in the problem of determining just what is 
to count as a physical fact. In particular, he wonders how this question can be 
answered in a non-trivial, non-arbitrary way. Quine's own way appears to be 
historical: a physical fact is whatever present or future physicists say it is. But 
why should we accept physicists’ pronouncements on the matter? ‘Simply 
because it is the business of theoretical physics, and of no other branch of 
science, “to say that ... minimum catalogue of states would be sufficient to 
justify us in saying that there is no change without a change in positions or 
states (‘Facts of the Matter’)”.’1

Even if this answer, or something like it, is accepted, Hookway thinks that 
Quine slides too glibly from (1) to (2) (and (3)). He poses the difficulty as 
follows: 


Suppose that somebody accepted Quine’s determinationist thesis [(1)], but 
attempted to deny [(2)] that the physical facts are the only ultimate ones. In order 
to understand the ultimate structure of reality, they might argue, we need to 
know about more than the ultimate events there are, and the physical laws 


1 W. V. Quine, ‘Reply to Hilary Putnam’, in Lewis Edwin Hahn and Paul Arthur Schilpp (eds.), 
The Philosophy of W. V. Quine. La Salle, Illinois: Open Court, 1986, p. 430. 




governing their behaviour. We also need to know how they are to be classified, 
the fundamental kinds into which they fall. In that case, there are structural 
features of reality which we shall not know until we have developed economic 
theory; there are classifications which reveal structure but which cannot be 
identified by physics. Hence, Quine's physicalism [(1)] does not entail [(2)] that it is
physics alone that describes ultimate structure. (pp. 76-7, my emphasis) 


The point is, even if one accepts the determinationist thesis, why should one
accept the view that the physical facts are all the facts? Why aren't there facts
in other scientific domains, e.g., in economics? 

The difficulty that Hookway has with accepting Quine's claim that the 
physical facts are all the facts is one which he shares with Richard Rorty. 
Hookway and Rorty see the issue in terms of a question regarding objectivity: 
why should one agree with Quine that physics is objective and everything else 
is subjective? Since Rorty construes factuality in terms of objectivity, and 
objectivity in terms of agreement, and since he sees no good reason for 
denying, in principle, that agreement is possible in economics (or even literary 
criticism, for that matter), he sees no good reason for denying that economics 
(or whatever) could become just as factual as physics. Hookway suggests a 
similar view when he writes: 


If talk of values, or beliefs, or meanings is necessary for us to understand our 
practices and our relations to them; and if there are means for progressing 
towards agreement on controversial issues about meanings, or values or 
beliefs—then values, beliefs and meanings are real. There may be room for a more 
relaxed realism which contrasts with Quine's austere conception of reality. (p. 216,
my emphasis) 


Hookway acknowledges that the proper Quinian response to this challenge is 
to claim 'that the kinds discerned by economists do not represent part of 
"ultimate structure", but reflect classifications determined by our practical 
interests and concerns' (p. 77). Only the language of physics is fully factual 
and, therefore, fully explanatory. Other domains of discourse are not fully 
factual and, therefore, are not fully explanatory. Some domains are better off in 
this regard than are others, however; for example, behavioristic talk is better 
off than mentalistic talk and, therefore, so-called behavioristic explanation is 
closer to real, physical, explanation than is so-called mentalistic explanation. 
Whether this Quinian response can be rendered entirely adequate is a question 
which Hookway never explicitly answers (but see pp. 215-16). However, one 
gets the distinct impression that he does not believe that it can be. Rather, he
seems to be much more at home with a version of Goodmanian pluralism than 
with what he perceives as Quinian ‘austerity’. Indeed, the reader need not be 
acute of ear to hear the palpable sigh of relief accompanying Hookway’s 
assurance that 'Quine's physicalism and extensionalism seem to be indepen- 
dent of the doctrines discussed in Part I [viz., those associated with the five 
milestones of empiricism]' (p. 124). 




Hookway's discussion of Quine's extensional canonical idiom (chapter 5) is
very clear but elementary. His discussion of intensionality (chapter 6) is a 
veritable masterpiece of exposition. Not only does he demonstrate the 
intensionality of various idioms, for example, of tense, belief, necessity, 
probability, dispositions, subjunctive conditionals, etc., he clearly explains 
Quine's response to each. Furthermore, along the way he provides a good 
account of Quine's notion of physical object. Finally (chapter 7), he introduces 
the reader to de dicto and de re modalities, to quantified modal logic, and to
Quine's objections to quantified modal logic which focus on essentialism and
cross-world identity. 

All in all, Part II is an enlightening and useful discussion of Quine's 
physicalism and extensionalism, the two sources of Quine's austere conception 
of the world. Even so, allusions to Quine's 'taste for desert landscapes' (p. 117) 
notwithstanding, Hookway's ubiquitous references to 'Quine's austere con- 
ception of the world' may well strike the reader as somewhat hollow, what 
with Quine's profligate notion of physical object and his quantifying over 
classes, classes of classes, and so on up. But this reaction misses Hookway's 
point. His point is that the world may contain all of Quine's kinds of objects, but
it also contains other kinds of objects. In short, he thinks that Quine's 
physicalism draws the fact/nonfact (read: objective/subjective) distinction in 
too lopsided a manner. Furthermore, he is skeptical of the desirability, the 
feasibility, and even, perhaps, the intelligibility of the ideal of physics providing 
a final account of what there is. 

Part III: Mind and Meaning. The three chapters (8, 9, and 10) constituting 
this part of Hookway's book focus on Quine's controversial thesis of 
indeterminacy of translation. 

In chapter 8, after explaining Quine's indeterminacy thesis, Hookway
examines two lines of argumentation that Quine uses to support his thesis, one 
line which derives from the second milestone of empiricism (viz., the primacy 
of sentence meaning) and another line which derives from the third milestone 
(viz., holism). He then spells out what he takes to be the dire philosophical 
implications of indeterminacy, first for mind and meaning, then for reference, 
ontological commitment, and truth. He concludes the chapter by wondering 
whether the philosophical consequences of indeterminacy do not 'constitute a 
reductio ad absurdum of something. ... Do we have a refutation of a certain 
picture of philosophy, of a philosophical conception of fact, of physicalism, of a 
distinctive view of what physicalism involves, or of a particular account of 
which physical facts are relevant to claims about mind and meaning?' (pp. 
144-5). Chapter 10 trains the reductio more specifically on Quine's physica- 
lism and empiricism, and ‘the remaining chapters [in Part IV] attempt to 
evaluate these fundamental commitments’ (p. 167). 

The first of Quine’s arguments that Hookway examines in chapter 8 claims 
that stimulus meanings of sentences (such as 'Gavagai') leave questions




regarding the reference of the terms which the sentences might contain with 
no determinate answers. According to Quine, this argument ‘illustrates the 
indeterminacy of translation only of terms, not of sentences. . . . But my thesis 
of indeterminacy of translation applies first and foremost to sentences. . . .’2
Thus, this first line of argumentation is best understood as attempting to 
establish inscrutability of reference and not indeterminacy of sentence 
translation. Hookway’s discussion tends to conflate these two much of the 
time. Keeping the two distinct is more than mere pedantry, however, for Quine 
regards the inscrutability thesis, but not the indeterminacy thesis, to be 
scarcely debatable. Furthermore, only lately has Quine ‘appreciated how fully
... [proxy functions] support the thesis of inscrutability of reference and how
much clearer that thesis becomes when propounded independently of the
indeterminacy of translation.’3

The second line of argumentation that Hookway examines in chapter 8 
derives from moderate holism: ‘A translation manual provides a kind of 
description of an overall pattern of dispositions to verbal behaviour. Owing to 
the holistic character of these dispositions—each reflects the impact of many 
beliefs—different systematic descriptions are possible which fit all the facts’ (p. 
135). In chapter 10, Hookway develops this argument further. Following 
Fodor, he differentiates evidential holism, i.e., the Duhem thesis (pertaining to 
evidence) from semantic holism (pertaining to meaning), and he seems to 
agree with Fodor that semantic holism is a rather illegitimate offspring of an 
upright evidential holism (see p. 166). Hookway rightly remarks that anyone 
‘who is already committed to an empiricist theory of meaning, claiming that
the meaning of a sentence is constituted by what would count as evidence for
or against it, will not see much difference between evidential and semantic 
holism' (p. 166). Hookway does not name names here, but it is pretty clear that 
Quine's is on the tip of his tongue. The suggestion seems to be that Quine's 
commitment to semantic holism is unreasoned, merely ad hoc and dogmatic— 
something that Quine imbibed while sitting at Carnap's knee. 

If Hookway sees Quine in this light, it is due, perhaps, to his overlooking
Quine's explicit commitment to verificationism: ‘If we recognize with Peirce 
that the meaning of a sentence turns purely on what would count as evidence
for its truth [i.e., verificationism], and if we recognize with Duhem that
theoretical sentences have their evidence not as single sentences but only as 
larger blocks of theory [i.e., evidential holism], then the indeterminacy of 
translation of theoretical sentences is the natural conclusion.'4 If this 


2 W. V. Quine, ‘Indeterminacy of Translation Again’, The Journal of Philosophy 84 (January 
1987), p. 8. 

3 W. V. Quine, ‘Reply to Paul A. Roth’, in Hahn and Schilpp, eds., p. 460.

4 W. V. Quine, Ontological Relativity and Other Essays, pp. 80-1. New York: Columbia University
Press, 1969. 




quotation leaves a lingering doubt as to whether Quine would really call 
himself a verificationist, the following quotation should erase all doubt: 


Gibson cites Føllesdal's interesting observation that the indeterminacy of
translation follows from holism and the verification theory of meaning. Føllesdal
mistrusts this defense because of doubts about verificationism, and I gather that 
Gibson agrees. But I find it attractive. The statement of verificationism relevant to 
this purpose is that ‘evidence for the truth of a sentence is identical with the 
meaning of the sentence’; and I submit that if sentences in general had meanings,
their meanings would be just that. It is only holism itself that tells us that in 
general they do not have them.5 


But if Quine is a verificationist, is his commitment to verificationism 
unreasoned, ad hoc and dogmatic, as Hookway’s remarks seem to suggest? No; 
and showing that it is not also provides a partial response to Hookway’s 
question of whether indeterminacy and its philosophical consequences do not 
function as a reductio ad absurdum of something: ‘Should the unwelcomeness of 
the conclusion [i.e., indeterminacy,] persuade us to abandon the verification 
theory of meaning? Certainly not. The sort of meaning that is basic to 
translation, and to the learning of one’s own language, is necessarily empirical 
meaning and nothing more.’6 Thus, it is the empirical conditions of the settings
of radical translation and ordinary language learning which provide the 
support for Quine's verificationism. 

Furthermore, Quine not only denies that indeterminacy refutes his 
verificationism, he denies that it refutes his behaviorism: 


Critics have said that the thesis is a consequence of my behaviorism. Some have 
said that it is a reductio ad absurdum of my behaviorism. I disagree with the second 
point, but I agree with the first. I hold further that the behaviorist approach is 
mandatory. In psychology one may or may not be a behaviorist, but in linguistics 
one has no choice.7


In chapter 9, Hookway again takes up the thesis of inscrutability of reference. 
He discusses the differences between Quine's notion of a translation manual 
and Gareth Evans' notion of a theory of meaning. He also explains their 
different conceptions of psychological explanation and of the relevance of such 
explanations to linguistics. The issues here are difficult and complex, and I 
shall not do Hookway the injustice of trying to summarize his discussion of 
them. However, I do commend Quine's ‘Reply to Gilbert Harman’8 to any
reader desirous of further clarification of Quine's views on the issue of linguistic 
performance being guided by rules. 


5 W. V. Quine, 'Reply to Roger F. Gibson, Jr.', in Hahn and Schilpp, eds., pp. 155-6. 

6 Quine, Ontological Relativity and Other Essays, p. 81.

7 W. V. Quine, ‘Indeterminacy of Translation Again’, p. 5. 

8 W. V. Quine, ‘Reply to Gilbert Harman’, in Hahn and Schilpp, eds., pp. 181-8. See especially p. 
186. 




In chapter 10, Hookway again takes up the connection between holism and 
indeterminacy, some aspects of which I have already noted. However, the bulk 
of chapter 10 is devoted to examining the views of Donald Davidson, ostensibly 
because such an ‘examination of how the Quinean framework is transformed 
in Davidson’s hands will help us to understand how it can be challenged’ (p. 
167). Further discussion of differences between Quine and Davidson occurs in 
section 12.5. 

Part IV: Knowledge and Reality. In the two chapters (11 and 12) that 
constitute the final part of his book, Hookway delves more deeply into the 
nature of Quine’s general philosophical undertaking. In chapter 11, he 
questions whether Quine’s naturalized epistemology really is epistemology 
and whether Quine’s commitment to empiricism undercuts his commitment to 
realism. In chapter 12, Hookway explains how Quine consistently maintains 
both realism and empiricism, he expresses his doubts about Quine’s claim that 
the physical facts are all the facts, and he sketches a Davidsonian alternative to 
Quine’s empiricism. 

Hookway explains in chapter 11 how traditional epistemology, since 
Descartes, has been motivated by the skeptic’s challenge to show that 
knowledge of the external world is possible. Assuming the skeptic’s challenge 
is itself unproblematic, a successful epistemology would be one that found 
grounds for justifying such knowledge claims, grounds which, on pain of 
vicious circularity, are not to be part of the body of knowledge being justified. 
Rather, they somehow or other must be self-justifying. Quine calls such 
epistemology first philosophy, and his naturalism repudiates it: there is no first 
philosophy. Are we, then, left with skepticism? 

Quine underscores the fact that not only is epistemology motivated by 
skepticism but skepticism is motivated by knowledge, scientific knowledge of 
the external world—the very knowledge that the skeptic is seeking to refute. 
For example, in order for the skeptic's argument from illusion to get off the 
ground, it must rely on a distinction between reality and illusion. What is 
thought to be real is not. Quine does not regard this use of science by the skeptic 
in his/her attempt to refute science as illegitimate; rather, it is merely an 
instance of argument by reductio ad absurdum. Well and good for the skeptic; 
but if there is no first philosophy, then what role, if any, is left to epistemology? 

Epistemology goes on, but since the skeptical challenge to science arises 
from within science, epistemology goes on within science. The new epistemo- 
logist is out to defend science from within, against its own self-doubts. Such a 
naturalized epistemology is both descriptive and normative, but it is not 
justificatory in the way that the old-time epistemology aspired to be. Thus, the 
new epistemology is circular, but it is not viciously circular, for it has 
renounced the traditional goal of finding something firmer than science upon 
which to ground science. 




A far cry, this, from old epistemology. Yet it is no gratuitous change of subject 
matter, but an enlightened persistence rather in the original epistemological 
problem. It is enlightened in recognizing that the skeptical challenge springs 
from science itself, and that in coping with it we are free to use scientific 
knowledge. The old epistemologist failed to recognize the strength of his 
position.⁹ 


The problem with this line, Hookway notes, is that Quine claims both that the 
new epistemology gives up the traditional goal of first philosophy and that the 
new epistemology is no gratuitous change of subject matter. How can Quine 
maintain both claims consistently? 

Hookway approaches this issue through a discussion of Barry Stroud’s 
contention that Quine’s notion of naturalized epistemology either aspires to be 
real epistemology (because it attempts to answer the skeptic’s challenge) but 
fails, or it has no such aspiration and reduces to nothing more than the 
physiology of belief formation. If naturalized epistemology is no gratuitous 
change of subject matter it must, according to Stroud, attempt to answer the 
skeptic. However, Stroud contends, if Quine allows (as he does) that the 
skeptic’s use of science to refute science is legitimate, then the naturalized 
epistemologist, the defender of science from within, cannot use the now 
suspect scientific knowledge in fashioning his/her defense. Thus, if Stroud’s 
skeptic can show that science is inconsistent, then all of science is suspect. So, 
Quine’s naturalized epistemology cannot hope to refute the skeptic. As 
Hookway reports, Quine contends that Stroud's skeptic, who would throw the 
baby out with the bath water, is simply overreacting. 

Hookway suggests, rightly I believe, that the difference between Stroud and 
Quine over the nature of skepticism and over the question whether naturalized 
epistemology is real epistemology turns on differences in their attitudes toward 
the acceptability of the notion of a Ding-an-sich. Stroud's skeptic takes this 
notion seriously, Quine’s does not. From Stroud's transcendental perspective 
all of present science may be false because it may very well not correspond to 
the Ding-an-sich; from Quine’s immanent perspective all of present science may 
be false because future science may overturn it. But, as Hookway explains, 
Quine will have none of the transcendental doubt that motivates Stroud’s 
skeptic. All doubt, for Quine, is immanent, and of a piece with science itself. 

Hookway next focuses on a tension in Quine’s thought between his realism 
and his empiricism. Quine professes to be a scientific realist, but he also claims 
that all possible evidence underdetermines our theory of the world. Thus, 
given all this empirical slack, it is quite possible that many different theories of 
the world could do justice to all possible evidence. It would appear, then, that 
Quine's underdetermination thesis supports an anti-realist view of science. 


9 W. V. Quine, The Roots of Reference, p. 3. La Salle, Illinois: Open Court, 1974. 




Hookway’s resolution of this apparent tension in Quine’s thought, intro- 
duced on p. 200 and developed on pp. 207 f. (chapter 12), is accurate in respect 
to Quine’s current views, but it is not without its irony. For the passage from 
Quine’s Theories and Things which Hookway cites on p. 200 to support his 
interpretation is superseded in a later printing of Theories and Things by a 
passage designed to deny what the earlier passage asserts.¹⁰ The point at issue 
is whether Quine wants to say of two empirically equivalent theory 
formulations, which cannot be rendered logically equivalent by a reconstrual 
of predicates, but which have been rendered logically compatible by the trivial 
expedient of changing the spelling of terms contained in the conflicting 
sentences of the respective theory formulations, whether both theory formula- 
tions are true or only one is true. In the passage cited by Hookway, Quine says 
both are true. But in the later printing Quine alters the passage to say that only 
one is true. (Even in the original printing, but in a different essay, he says that 
only one is true.¹¹) Quine has come to refer to these two options as his 
ecumenical position and his sectarian position, respectively.¹² Lately, however, 
in conversation and soon to be in print, Quine has, under certain 
circumstances, again endorsed the ecumenical position. Thus, ironically, 
Hookway's account remains largely accurate. Irony or not, I think that 
understanding Hookway's excellent discussion given in his section 12.3, 
"Taking Our Own Physics Seriously'—possibly the most important section of 
his book—is absolutely crucial to gaining a proper understanding of Quine's 
philosophy. 

Finally, Hookway discusses further Quine's physicalism and empiricism. He 
argues that Quine's physicalism leads to 'a narrow philosophical vision' (p. 
214), i.e., a vision preoccupied with limning the most general traits of reality, 
that it may well undercut self-understanding by discrediting the language of 
self-description, e.g., folk psychology, and that even if the determinationist 
thesis is true it does not automatically follow that the physical facts are all the 
facts. He also explains how the empiricism, which supports Quine's (dubious?) 
dichotomy between the factuality of discourse in the area of physics and the 
lack of factuality of discourse in all other areas could be avoided by following in 
Davidson's footsteps: ‘Once the Davidsonian approach to interpretation is 
accepted, there is no basis for distinguishing areas of discourse as Quine does’ 
(p. 218). ‘But’, Hookway notes, ‘if this shows how there might be an 
alternative to Quinean empiricism, it does not yet refute it’ (p. 218). Hookway 
concludes his book by reflecting on the rationale for choosing between Quine’s 
and Davidson's approaches to philosophy. Reading somewhat between the 


10 W. V. Quine, Theories and Things, p. 29. Cambridge, Massachusetts: Harvard University Press, 
1981. See footnote 3 on p. 29 of the paperback edition of Theories and Things for 
acknowledgement of this change. 

11 See Quine, Theories and Things, pp. 21-2. 

12 See Quine, ‘Reply to Roger F. Gibson’. 




lines, I think it is fair to conjecture that Hookway's own sympathies lie more 
with Davidson than with Quine. 

In sum, I believe that Hookway’s book is a significant contribution to the still 
burgeoning Quine literature. Not only does Hookway, through his skillful 
expository writing, succeed admirably in making Quine’s views accessible to 
non-philosophers and to students, he also, through his frequent insights into 
Quine's philosophy, sheds much light on several difficult matters of interpreta- 
tion which have caused no little debate among specialists in Quine's thought. 


The London School of Economics 
and Political Science 


LAKATOS AWARD 
IN PHILOSOPHY OF SCIENCE 


This Award is for an outstanding contribution to the philosophy of 
science in the form of a book published in English. The Award is 
endowed by the Latsis Foundation in memory of Imre Lakatos and 
is administered on behalf of the London School of Economics by a 
committee consisting of the Director of the School or his deputy as 
chairman, and Professors Hans Albert, Adolf Grünbaum, Alan 
Musgrave and John Watkins. The committee makes the Award on 
the advice of an independent and anonymous panel of selectors. 
The value of the Award is £10,000. In any given year the Award 
may be shared or no Award made. 


To receive an Award a successful candidate must visit the School 
and deliver a public lecture. The Award has been won by Bas Van 
Fraassen and Hartry Field (1987), Michael Friedman and Philip 
Kitcher (1988) and Michael Redhead (1989). 


Candidates must be nominated by at least three people of 
recognized professional standing. Nominators should give their 
grounds for the nomination and indicate the candidate's age, since 
a preference may be given to a younger scholar. Three copies of the 
book should, if possible, be sent to the address below. All 
communications should be marked 'Lakatos Award' and addressed 
to: 


The Secretary, 

The London School of Economics, 
Houghton Street, 

London WC2A 2AE, 

England. 


The 1990 Award will be for a book published during the last six years 
(bearing an imprint from 1984 to 1990 inclusive). The closing date for 
nominations is 15 April 1990. 


BRITISH SOCIETY FOR THE PHILOSOPHY OF SCIENCE 


Annual Conference 1990 
Wolfson College Cambridge: September 21-23 1990 


P. GALISON (Stanford University) 
T. PINCH (University of York) 
Experimental Science - Discovery versus Social 
Construction 


A. G. CAIRNS-SMITH (University of Glasgow) 
H. KAMMINGA (University of Cambridge) 
The Origin of Life 


J. POLKINGHORNE (University of Cambridge) 
A. O'HEAR (University of Bradford) 
Science and Religion 


H-P. DÜRR (Max Planck Institute, Munich) 
A. BRENNAN (University of Stirling) 
Ethics and the Environment 


Papers by Research Students 
Further details will be announced shortly 


Conference Organisers: Professor M. Redhead 
Dr N. Jardine 
Department of History & 
Philosophy of Science 
Free School Lane 
Cambridge 
England. 


PROMETHEUS EXPLORES 
NEW FRONTIERS 


INTRODUCTORY READINGS 
IN THE PHILOSOPHY 
OF SCIENCE 
Revised Edition 
edited by E. D. Klemke, 

Robert Hollinger, and A. David Kline 
Containing 28 essays by the very best 
theorists in the field, this new edition 
opens with a crucial section on science 
and non-science, then it moves on to 
provide substantially revised sections 
on explanation and law, theory and 
observation, confirmation and accep- 
tance, science and values, and science 
and culture. 
480 pages • ISBN 0-87975-423-0 • Paper $18.95 


BUT IS IT SCIENCE? 

The Philosophical Question in the 
Creation/Evolution Controversy 
edited by Michael Ruse 
Collects the works of prominent evo- 
lutionists and creationists in an effort 
to draw more accurately the demar- 

cation between science and religion. 
“Exquisite philosophical, scientific, and legal 
detail... an interesting analysis of a 
controversy that just won't go away.” 
—Science Books & Films 


406 pages (Illustrated) 

ISBN 0-87975-439-7 • Cloth $24.95 
READINGS IN THE 
PHILOSOPHICAL PROBLEMS 
OF PARAPSYCHOLOGY 
edited by Antony Flew 
“Collects some of the best philosophical writings 
on the subject to show the depths of confusion 
and illogic into which so much of para- 

psychology descends . . . thorough." 
—Contemporary Psychology 
369 


ISBN 0-87975-382-X • Cloth $25.95 
ISBN 0-87975-385-4 • Paper $17.95 


THE SEARCH FOR 
PSYCHIC POWER 
ESP and Parapsychology Revisited 
C. E. M. Hansel 
With unrelenting scholarship and skill, 
Hansel focuses on the history of psy- 
chical research, analyzes the meth- 
odologies of these experiments, and 
finds substantial evidence of shoddy 
controls, inaccurate statements, in- 
consistent reporting, and even 
outright deception. 
350 

ISBN 0-87975-516-4 • Cloth $24.95 

ISBN 0-87975-533-4 • Paper $16.95 
SCIENCE AND EARTH HISTORY 
The Evolution/Creation Controversy 

Arthur N. Strahler 
Examines the creationists' claim of sci- 
entific evidence in light of the findings 
of mainstream science in the research 
fields of cosmology, astronomy, geo- 
physics, geology, paleontology, and 
evolutionary biology. 
“Arresting, captivating, and stimulating 
... a modern masterpiece." 
— Science Books & Films 
502 pages (Over 300 Illustrations) 85x11 
ISBN 0-87975-414-1 • Cloth $39.95 
PSEUDOSCIENCE AND 
THE PARANORMAL 
A Critical Examination of the Evidence 
Terence Hines 
A refreshing look at all the major (and 
minor) pseudoscientific and para- 
normal claims which are so prevalent 
today. 
“A wealth of anecdotes and historical back- 
ground... very readable as well as in- 
formative." —New Scientist 
372 pages (16 pages of Illustrations) 
ISBN 0-87975-419-2 • Paper $17.95 


PROMETHEUS BOOKS 
700 EAST AMHERST STREET • BUFFALO, NY 14215 
Available through the Prometheus United Kingdom Office 


10 Crescent View • Loughton, Essex IG10 4PZ 





New in paperback 
LEVIATHAN AND THE AIR-PUMP 


Hobbes, Boyle, and the Experimental Life 
Steven Shapin and Simon Schaffer 


“If any proof of the intellectual buoyancy 
or intrinsic worth of the history and 
philosophy for science was needed, noth- 
ing better could be provided than this 
study by Steven Shapin and Simon Schaf- 
fer. . . . Their findings suggest the futility 


of wrenching science from its ideological 
context, and not only with respect to the 
seventeenth century; they also detect 
parallels with the crisis of confidence 
affecting contemporary science.” 
—Charles Webster, Times Literary Supplement 
Now in paper: $16.95 ISBN 0-691-02432-4 


PRICES ARE IN U.S. DOLLARS 
ORDER FROM YOUR BOOKSELLER OR FROM 


PRINCETON UNIVERSITY PRESS 


ORDER DEPT., 3175 PRINCETON PIKE • LAWRENCEVILLE, NJ 08648 U.S.A. 







OXFORD UNIVERSITY PRESS 


Publishers of the OXFORD ENGLISH DICTIONARY Second Edition 


Looking afresh at nature... 


Laws and Symmetry 
BAS C. VAN FRAASSEN 


The author analyses and rejects the arguments 
that there are laws of nature, or that we must 
believe that there are, arguing that we should 


discard the idea of law as an inadequate clue 
to science. 


0 19 824811 3, 416 pages, Clarendon Press 
0 19 824860 1, paper covers £11.95 


The Wellborn Science 
Eugenics in Germany, France, Brazil, 
and Russia 


Edited by MARK B. ADAMS 


This book analyzes the eugenics movement in each 
of these countries, examining the scientific 
component of the programmes, and showing how 
social, religious, and political forces significantly 
altered the original scientific goals. 


Monographs on the History and Philosophy of Biology 
0 19 505361 3, 256 pages, OUP USA £28.00 


Ideas of Space 
Euclidean, Non-Euclidean, 
and Relativistic 

Second Edition 


JEREMY GRAY 

This updated edition contains an additional 
chapter on the Arabic contribution to this 
fascinating topic in the history of mathematics. 
"This promises to become a classic text for those interested 
in considering changing mathematical perceptions of 


an Gray’s book is a pleasure to read.’ 
Historia Mathematica 

0 19 853935 5, 254 pages, numerous line drawings, 
Clarendon £35.00 
0 19 853934 7, paper covers £15.00 





Nature's Capacities and 
their Measurement 


NANCY CARTWRIGHT 


This book challenges the orthodox empiricist 
view that there are no powers and capacities in 
nature. The author argues that capacities are 
essential in our scientific world, and, further, 
that they can comply with strict standards of 
testability. 


0 19 824477 0, 278 pages, figures throughout, 
Clarendon Press £27.50 


Philosophy, Psychiatry 
and Neuroscience—Three 
Approaches to the Mind 


A Synthetic Analysis of the Varieties of 
Human Experience 

EDWARD M. HUNDERT 

Through a detailed discussion of major themes 
from philosophy, psychology, and neuroscience, 
this book proposes a challenging new view of the 
mind, which integrates insights from all three 
disciplines. 


0 19 824796 6, 360 pages, 24 figures, Clarendon 
Press £30.00 


Mathematics without 
Numbers 


Towards a Modal-Structural 
Interpretation 


GEOFFREY HELLMAN 


This book works out a detailed interpretation of 
mathematics as the investigation of ‘structural 
possibilities', as opposed to absolute, Platonic 
objects. 

0 19 824934 9, 166 pages, Clarendon Press 
£22.50 





